(57) Abstract: The present invention identifies that the expression of Activation Induced Deaminase (AID) or its homologues in
cells confers a mutator phenotype and thus provides a method for generating diversity in a gene or gene product as well as cell lines
capable of generating diversity in defined gene products. The invention also provides methods of modulating a mutator phenotype
by modulating AID expression or activity.
Activation Induced Deaminase (AID)

The present invention identifies that the expression of Activation Induced Deaminase
(AID) and its homologues, such as Apobec, in cells confers a mutator phenotype and thus
provides a method for generating diversity in a gene or gene product as well as cell lines
capable of generating diversity in defined gene products. The invention also provides
methods of modulating a mutator phenotype by modulating the expression or activity of
AID or its homologues.

BACKGROUND

In normal cells, a low mutation rate ensures genetic stability and this depends on effective
DNA repair mechanisms for repairing the many accidental changes that occur continually
in DNA.

However, during the generation of antibodies, point mutations occur within the V-region
coding sequence of the antigen receptor loci and the rate of mutation observed, called
somatic hypermutation, is about a million times greater than the spontaneous mutation
rate in other genes. The antigen receptor loci are the only loci in human cells that undergo
programmed genetic alterations. However, the mechanisms that allow the nucleotide
changes to be controlled and targeted to the DNA of a precisely specified part of the
genome in this way are not known.

Functional antigen receptors are assembled by RAG-mediated gene rearrangement and the
isotype switch from IgM to IgG, IgA and IgE is effected by class switch recombination.
Aberrant forms of RAG-mediated gene rearrangement and class switch recombination
have been shown to underpin many of the chromosomal translocations associated with
lymphoid malignancies. In the case of somatic hypermutation, it was proposed several
years ago by Rabbitts et al (1984 Nature 309, 592-597) that the chromosomal
translocations which bring the c-myc proto-oncogene into the vicinity of the IgH locus
could make it a substrate for the antibody hypermutation mechanism. Recent work
using hypermutating cell lines has provided evidence in support of this (Bemark, M and
Neuberger, M.S, 2000 Oncogene 19, 3404-3410). A wider role for aberrant hypermutation
came with the finding that several genes apart from the immunoglobulin V genes can
(without being translocated into the Ig loci) apparently act as substrates for the antibody
hypermutation mechanism in that they exhibit an increased frequency of point mutation in
hypermutating B cells. Recent evidence also points to a high frequency of mutations in
many B cell tumours and it has been proposed that this is a result of a transient
hypermutation phase caused by the antibody hypermutation mechanism. In all these
cases, the aberrant mutations are largely at dC/dG residues.

An uncontrolled and enhanced rate of mutation in non-antibody producing cells can also
be deleterious. For example, mutations are the hallmark of cancer and the enhanced rate
of mutation in cancer cells may explain their capability to continually grow and evade the
normal human defences. The "mutator phenotype" hypothesis attributes this phenomenon
to an increasing rate of errors in DNA replication as a tumour grows. According to this
theory, genes encoding proteins normally interacting with nucleotides such as DNA
polymerases and DNA repair enzymes may be faulty in cancer cells and therefore cause
subsequent mutations.

In vitro, understanding and harnessing the means for controlling an enhanced rate of
mutation can be usefully employed, for example, in generating diversity of gene products
such as generating antibody diversity.

Many in vitro approaches to the generation of diversity in gene products rely on the
generation of a very large number of mutants which are then selected using powerful
selection technologies. For example, phage display technology has been highly successful
as providing a vehicle that allows for the selection of a displayed protein (Smith, G.P. 1985
Science, 228, 1315-7; Bass et al Proteins, 8, 309-314, 1990; McCafferty et al, 1990
Nature, 348, 552-4; for review see Clackson and Wells, 1994 Trends Biotechnol 12,
173-84). Similarly, specific peptide ligands have been selected for binding to receptors by
affinity selection using large libraries of peptides linked to the C terminus of the lac
repressor LacI (Cull et al., 1992 Proc Natl Acad Sci U S A, 89, 1865-9). When expressed in
E. coli the repressor protein physically links the ligand to the encoding plasmid by binding
to a lac operator sequence on the plasmid. Moreover, an entirely in vitro polysome display
system has also been reported (Mattheakis et al, 1994 Proc Natl Acad Sci U S A, 91,
9022-6) in which nascent peptides are physically attached via the ribosome to the RNA which
encodes them.



Artificial selection systems to date rely heavily on initial mutation and selection, similar
in concept to the initial phase DNA rearrangement involving the joining of
immunoglobulin V, D and J gene segments which occurs in natural antibody production,
in that it results in the generation of a "fixed" repertoire of gene product mutants from
which gene products having the desired activity may be selected.

Unlike in the natural immune system, however, artificial selection systems are poorly
suited to any facile form of "affinity maturation", or cyclical steps of repertoire generation
and development. One of the reasons for this is that it is difficult to generate enough
mutations and to target these to regions of the molecule where they are required, so
subsequent cycles of mutation and selection do not lead to the isolation of molecules with
improved activity with sufficient efficiency.

In vivo, after the primary repertoire of antibody specificities is created by V-D-J
rearrangement, and following antigen encounter in mouse and man, the rearranged V
genes in those B cells that have been triggered by the antigen are subjected to two further
types of genetic modification. Class switch recombination, a region-specific but largely
non-homologous recombination process, leads to an isotype change in the constant region
of the expressed antibody. Somatic hypermutation introduces multiple single nucleotide
substitutions in and around the rearranged V gene segments. This hypermutation
generates the secondary repertoire from which good binding specificities can be selected
thereby allowing affinity maturation of the humoral immune response. In chicken and
rabbits (but not man or mouse) an additional mechanism, gene conversion, is a major
contributor to V gene diversification.






Much of what is known about the somatic hypermutation process which occurs during
affinity maturation in natural antibody production has been derived from an analysis of
the mutations that have occurred during hypermutation in vivo (for reviews see Neuberger
and Milstein, 1995 Curr. Opin. Immunol. 7, 248-254; Weill and Reynaud, 1996 Immunol
Today 17, 92-97; Parham, 1998 Immunological Reviews, Vol. 162 (Copenhagen,
Denmark: Munksgaard)). Most of these mutations are single nucleotide substitutions
which are introduced in a stepwise manner. They are scattered over the rearranged V
domain, though with characteristic hotspots, and the substitutions exhibit a bias for base
transitions. The mutations largely accumulate during B cell expansion in germinal centres
(rather than during other stages of B cell differentiation and proliferation) with the rate of
incorporation of nucleotide substitutions into the V gene during the hypermutation phase
estimated at between 10^ and 10^-3 bp^-1 generation^-1 (McKean et al, 1984; Berek &
Milstein, 1988). However, a greater understanding of the steps involved in these later
stages of hypermutation would enable a more diverse range of gene products to be
obtained.

All three of the above processes, somatic hypermutation, gene conversion and
class-switch recombination, have been shown to depend upon activity of the protein Activation
Induced Deaminase (AID) (Muramatsu et al. (1999); Muramatsu M. et al. (2000); Revy,
P. et al. (2000); Arakawa, H. et al. (2002); Harris, R.S. et al. (2002); Martin, A. et al.
(2002) and Okazaki, I. et al. (2002)) which has been suggested (by virtue of its homology
with Apobec-1 (Muramatsu et al. (1999))) to act by RNA editing. However, evidence that
the three processes could be initiated by a common type of DNA lesion (Maizels et al.
(1995); Weill et al. (1996); Sale et al. (2001); Ehrenstein et al. (1999)) taken with the fact
that the first phase of hypermutation targets dG/dC (Martin et al. (2002); Rada et al. (1998);
Wiesendanger et al. (2000)) has suggested that AID may act directly on dG/dC pairs in the
immunoglobulin locus. However, to date, the actual function of AID has not been
described.

The AID homologue Apobec-1 has been identified as playing a role in modifying RNA.
Apobec-1 is a catalytic component of the apolipoprotein B (apoB) RNA editing complex
that performs the deamination of C6666 to U in intestinal apoB RNA thereby generating a
premature stop codon. Indeed, the oncogenic activity of Apobec-1, identified by its
overexpression in transgenic mice, has previously been attributed to its RNA editing
activity acting on inappropriate substrates.



Deamination of cytosine to uracil can occur in vivo at the level of the nucleotide and in DNA
as well as RNA. In the context of DNA, the low level deamination of cytosine to uracil
which takes place spontaneously (and which might be of relatively minor significance
when it occurs with free nucleotides or in mRNA) can have major effects, contributing to
genome mutation, cancer and evolution (Lindahl, T. (1993) Nature 362, 709-715).
However, to date, there is no biochemical evidence that APOBEC family members can
trigger such deamination in vitro.



Summary of the Invention 



The present inventors have demonstrated that expression of AID in Escherichia coli gives
a mutator phenotype yielding DNA nucleotide transitions at dG/dC. The mutation
frequency is enhanced by deficiency in uracil-DNA glycosylase indicating that AID acts
by deaminating dC residues in DNA.
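
By way of illustration only, the following minimal Python sketch shows why deamination of a dC residue, if the resulting dU is not excised, would be expected to yield a transition at dG/dC. The sequence, the position and the function names are hypothetical and form no part of the experimental work; repair and replication are reduced to the single assumption that dU templates dA.

```python
from typing import Tuple

# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement (same index order, not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand)


def deaminate(strand: str, position: int) -> str:
    """Replace the dC at `position` with dU, as cytosine deamination would."""
    if strand[position] != "C":
        raise ValueError("only dC residues can be deaminated to dU")
    return strand[:position] + "U" + strand[position + 1:]


def replicate_over_uracil(strand: str) -> Tuple[str, str]:
    """Fix an unrepaired dU as dT and return (top strand, complementary strand).

    Assumption: dU pairs like dT, so it templates dA on the opposite strand.
    """
    top = strand.replace("U", "T")
    return top, complement(top)


# Hypothetical five-base example with a dC at index 2.
original_top = "GACGT"
original_bottom = complement(original_top)               # "CTGCA"
lesion = deaminate(original_top, 2)                      # "GAUGT"
mutant_top, mutant_bottom = replicate_over_uracil(lesion)
print(original_top, "->", mutant_top)        # C to T on the deaminated strand
print(original_bottom, "->", mutant_bottom)  # G to A on the opposite strand
```

In a cell with functional uracil-DNA glycosylase the dU would normally be excised before it could be fixed in this way, which is consistent with the enhancement of mutation frequency observed when that enzyme is deficient.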

In addition, the expression of AID homologues, Apobec-1, Apobec3C and Apobec3G,
including their expression as part of a fusion protein in E. coli, also yields a mutator
phenotype and these homologues show an increased potency of mutator activity on DNA
sequences when compared to AID.



Furthermore, deamination of cytosine to uracil in DNA can be achieved in vitro using
partially purified APOBEC1 from extracts of transformed Escherichia coli. Its activity on
DNA is specific for single-stranded DNA and exhibits dependence on local sequence
context.

Accordingly, in a first aspect of the invention there is provided a cell modified to express
AID, or an AID variant, derivative or homologue, and having a mutator phenotype.




Suitably, the cell is modified to stably express AID, or an AID variant, derivative or
homologue, and has a mutator phenotype.

By "stable expression" of a gene is meant that the gene and its expression are substantially
maintained in successive generations of cells derived from transfected cells. In particular,
the term "stable expression" is not intended to encompass the transient expression of a
protein in a bacterial cell for the purpose of protein purification.

In another embodiment, the cell is transiently transfected to express AID, or an AID
variant, derivative or homologue, and has a mutator phenotype.

As used herein, "mutator phenotype" means an increased mutation frequency in the
transfected cells modified to express AID or its homologues when compared to
non-modified, non-transfected cells. Methods for measuring mutation frequency are described
herein. Suitably the mutations are nucleotide transitions at dG/dC as a result of
deamination of dC residues in DNA. The term "mutator activity" refers to the activity that
confers the mutator phenotype.
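
As a hedged illustration of the comparison implied by this definition, the following Python sketch expresses mutation frequency as mutant colonies per viable cell plated and compares modified with non-modified cells. The scoring by colony counts, the function names and all figures are assumptions made for the example only and are not data from the present work.

```python
def mutation_frequency(mutant_colonies: int, viable_cells: float) -> float:
    """Mutants scored on selective medium per viable cell plated."""
    if viable_cells <= 0:
        raise ValueError("viable cell count must be positive")
    return mutant_colonies / viable_cells


def fold_increase(modified: float, control: float) -> float:
    """Ratio of the mutation frequency in modified cells to that in control cells."""
    if control <= 0:
        raise ValueError("control frequency must be positive to form a ratio")
    return modified / control


# Hypothetical counts: 2 mutant colonies per 1e9 control cells plated,
# 16 mutant colonies per 1e9 cells modified to express AID.
control = mutation_frequency(2, 1e9)
modified = mutation_frequency(16, 1e9)
ratio = fold_increase(modified, control)
print(f"fold increase: {ratio:.1f}")
print("mutator phenotype" if ratio > 1 else "no increase over control")
```

Any ratio above one satisfies the definition as worded; how large an increase is regarded as meaningful in practice would depend on the variability of the assay used.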

In one embodiment, said cell is a prokaryotic cell, such as bacteria. Suitable bacteria
include E. coli.

In another embodiment, the cell is a modified eukaryotic cell in which altered AID
expression has been induced by introduction of an AID gene with the proviso that said
eukaryotic cell is not a cell of the human B lymphocyte lineage and, in particular, is not a
human Ramos, BL-2 or CL-01 cell nor a cell derived from the chicken cell line, DT40.
Suitably said cell is derived from mouse or man and is capable of generating
immunoglobulin diversity through somatic hypermutation or class switching.



In another embodiment, the AID homologue is Apobec and is, in particular, selected from
Apobec family members such as Apobec-1, Apobec3C or Apobec3G (described, for
example, by Jarmuz et al (2002)).

In yet another embodiment, the AID variant is a fusion protein. Suitably, said fusion
protein is AID, Apobec-1, Apobec3C or Apobec3G in which a heterologous protein or
peptide domain has been fused at either its N- or C- terminus. Preferably, the heterologous
peptide is fused at the amino terminus. Suitably, said heterologous peptide domain is a
binding domain which is one half of a specific binding pair which can interact with the
second half of said pair to form a complex. Suitable binding pairs include two
complementary components which can bind in a specific binding reaction. Examples of
specific binding pairs include His-tag - Nickel, DNA binding domain - DNA binding
domain recognition sequence, antibody - antigen, Biotin - Streptavidin etc.

The data presented herein are consistent with AID or its homologues activating 
deamination of dC as an enhancement of the effect is observed in cells lacking uracil- 
DNA glycosylase (UDG). 



Accordingly, in another embodiment, said cell further comprises a genetic background
which confers an enhanced mutator phenotype effect. In a particularly preferred
embodiment, the genetic background of a prokaryotic cell confers a UDG deficiency on
the cell. Said UDG deficiency is preferably induced by interfering with UDG expression
such as, for example, creating a ung- background. In some E. coli ung-1 mutants, some
back up UDG activity is provided by the product of the mug gene. Thus, in a further
embodiment, the cell comprises a combined background of ung- and mug-.

The introduction of modified expression of AID or an AID homologue into a cell can
increase the mutation rate above the background mutation rate that would normally be
observed in that cell. Suitably, the modified cell is capable of generating mutations in a
defined gene product. This can be particularly useful in the generation of gene diversity,
for example in the generation of antibody diversity where the defined gene product is an
immunoglobulin V region gene.



Such cells according to any embodiment of the first aspect and displaying an enhanced
rate of mutation can be useful in a method for preparing a gene product having a desired
activity.

Preferably the gene which it is desired to mutate is provided to AID or its
homologues as single-stranded DNA. Single stranded DNA may be provided by
introducing single stranded DNA directly or by introducing double stranded DNA which
is later converted to single stranded, for example, through enzymatic action such as
helicase or transcriptase activity.

In another aspect of the invention, there is provided a fusion protein comprising an AID, 
or AID variant, derivative or homologue, polypeptide having a mutator phenotype 
operably linked to one half of a specific binding pair. 

The term "operably linked" refers to a juxtaposition wherein the components described
are in a relationship permitting them to function in their intended manner. For example,
an AID polypeptide "operably linked" to one half of a specific binding pair is linked
through ligation of the nucleic acid coding sequences or otherwise such that a fusion
protein is produced in which the mutator activity of AID is unimpaired whilst allowing
the specific binding pair to form through interaction of the said one half with its
complement.

In a preferred embodiment, the one half of the specific binding pair in said fusion protein
is a DNA binding domain.

Preferably, the AID homologue is one of the Apobec family of proteins and, suitably, is
selected from the group consisting of Apobec-1, Apobec-3G and Apobec-3C.

In another aspect of the invention, there is provided a vector for expressing a fusion
protein in accordance with the previous aspect.

In yet another aspect of the invention, there is provided a cell modified to express a fusion
protein in accordance with that aspect of the invention.

The mutator activity of AID can be harnessed to drive mutation of specific gene products
of interest. Accordingly, in a further aspect of the invention there is provided a method for
preparing a gene product having a desired activity, comprising the steps of:

a) expressing a nucleic acid encoding the gene product in a population of cells
according to the invention;

b) identifying a cell or cells within the population of cells which expresses a mutant
gene product having the desired activity; and

c) establishing one or more clonal populations of cells from the cell or cells
identified in step (b), and selecting from said clonal populations a cell or cells
which expresses a gene product having an improved desired activity.

In one embodiment, the nucleic acid encoding the gene product is available to AID or an
AID homologue as single-stranded DNA.

Suitably, the nucleic acid encoding the gene product is operably linked to one component
of a specific binding pair. In this embodiment, a nucleic acid operably linked to the one
component, or second half, of a specific binding pair is ligated in such a way that the
binding of the other component, or first half, of a specific binding pair can take place.
Thus, where the first half of the specific binding pair is linked in a fusion protein to the AID
polypeptide having mutator activity, binding of the first and second halves of the specific
binding pair brings the mutator protein into range with the nucleic acid sequence such
that directed mutation of that particular nucleic acid sequence can take place.

In a particularly preferred embodiment, the specific binding pair is a DNA binding protein
- DNA binding protein recognition sequence. In this embodiment, the population of cells
comprises cells expressing a fusion protein being a fusion of AID polypeptide to a DNA
binding protein (or DNA binding domain) and the nucleic acid sequence encoding the
gene product is operably linked to the DNA binding protein recognition sequence. This
would allow the mutator activity of AID or its homologues to be specifically directed to
the nucleic acid encoding the gene product of interest.

Accordingly, in another aspect of the invention, there is provided a method for directing
mutation to a specific gene product of interest. Suitably said method comprises the steps
of:

i) generating a nucleic acid construct comprising a nucleic acid sequence encoding a
gene product operably linked to a DNA binding protein recognition sequence;

ii) transfecting said nucleic acid construct into a population of host cells expressing a
fusion protein in accordance with the invention;

iii) incubating said transfected host cells under conditions suitable for allowing the
specific binding pairing of DNA binding protein to DNA binding protein
recognition sequence to occur;

iv) identifying a cell or cells within the population of cells which expresses a mutant
gene product having the desired activity; and

v) establishing one or more clonal populations of cells from the cell or cells
identified in step (iv), and selecting from said clonal populations a cell or cells
which expresses a gene product having an improved desired activity.

Suitably said host cells may be prokaryotic cells, for example bacterial cells such as E. coli,
or they may be eukaryotic cells such as yeast or mammalian cells.

In one embodiment, the population of cells in accordance with the invention is derived
from a clonal or polyclonal population of cells which comprises cells capable of
constitutive hypermutation of V region genes.

The gene product may be an endogenous gene product such as the endogenous
immunoglobulin polypeptide, a gene product expressed by a manipulated endogenous
gene or a gene product expressed by a heterologous transcription unit operatively linked
to control sequences which direct somatic hypermutation, as described further below. In
this embodiment, the gene product is operably linked to a nucleic acid which directs
hypermutation.

Alternatively, the gene product may be a heterologous gene product.

The nucleic acid which is expressed in the cells of the invention and subjected to
hypermutation may be an endogenous region, such as the endogenous V region, or a
heterologous region inserted into the cell line of the invention. This may take the form,
for example, of a replacement of the endogenous V region with heterologous transcription
unit(s), such as a heterologous V region, retaining the endogenous control sequences
which direct hypermutation; or of the insertion into the cell of a heterologous transcription
unit under the control of its own control sequences to direct hypermutation, wherein the
transcription unit may encode V region genes or any other desired gene product. The
nucleic acid according to the invention is described in more detail below.

In another embodiment the gene product may be an endogenous gene product which is not
normally subject to hypermutation. Suitable gene products include genes implicated in
disease, oncogenes and other target genes. Thus, the gene product may be any gene
product in which mutation is desirable.

In one embodiment, the endogenous or heterologous gene may be integrated into a
chromosome.

In step b) or step (iv) above, the cells are screened for the desired gene product activity.
This may be, for example in the case of immunoglobulins, a binding activity. Other
activities may also be assessed, such as enzymatic activities or the like, using appropriate
assay procedures. Where the gene product is displayed on the surface of the cell, cells
which produce the desired activity may be isolated by detection of the activity on the cell
surface, for example by fluorescence, or by immobilising the cell to a substrate via the
surface gene product. Where the activity is secreted into the growth medium, or
otherwise assessable only for the entire cell culture as opposed to in each individual cell,
it is advantageous to establish a plurality of clonal populations from step a) in order to
increase the probability of identifying a cell which secretes a gene product having the
desired activity. Advantageously, the selection system employed does not affect the cell's
ability to proliferate and mutate.

Preferably, at this stage (and in step c) or step v)) cells which express gene products
having a better, improved or more desirable activity are selected. Such an activity is, for
example, a higher affinity binding for a given ligand, or a more effective enzymatic
activity. Thus, the method allows for selection of cells on the basis of a qualitative and/or
quantitative assessment of the desired activity. Successive rounds of selection may allow
for directed evolution in a gene product. Selection of mutants may also be achieved by
growth or selection on selective media as described herein.

In a preferred embodiment, the "population of cells" in the method is a population of
prokaryotic cells. In another embodiment, the "population of cells" is a population of
yeast cells.

The targeted mutation of a specific gene product of interest can be enhanced by providing
the nucleic acid encoding the gene product in a modified construct. Suitably the construct
is arranged so as to favour generation of a single-stranded substrate oligonucleotide (i.e.
the nucleic acid encoding the gene product of interest). An increased availability of single
stranded DNA can be achieved by providing the substrate oligonucleotide between two
convergent promoters. In one embodiment, this construct favours the generation of single
stranded DNA through DNA bending caused by promoter activity. In another
embodiment, this construct favours single stranded DNA through bi-directional
transcription activation.

Accordingly, in another aspect of the invention there is provided a construct for use in a
method in accordance with the invention, said construct comprising a nucleic acid
encoding the gene product of interest wherein said nucleic acid is placed under the control
of a first promoter upstream of the coding sequence and further comprising a second
promoter downstream of the coding sequence in the opposite orientation. Such a construct
may be referred to as a construct for convergent transcription.

A number of suitable promoter sequences are known to those skilled in the art. For
example, suitable Prokaryotic promoters include Activators such as AraBAD, PhoA,
Repressors such as Tet, Lac, Trp, Hybrid Lac/Trp such as Tac, pL and Regulatable
hybrids of pL such as pL-tet or Viral Polymerase, such as T7. Suitable Eukaryotic
promoters include, for example, RNA Polymerase I (e.g. 45S rDNA), RNA Polymerase II
(e.g. Gal4, β-Actin, Viral promoters, such as CMV-IE and Artificial promoters including
Tet-on, Tet-off) or RNA Polymerase III promoters including H1 RNA and U6 snRNA. In
particular, promoters include the PhoB promoter and inducible promoters such as the IPTG
inducible Trc promoter. Suitably said construct is as described in the examples section
herein.

In another aspect of the invention there is provided a method of identifying components of
AID-dependent mutation activity comprising expressing AID in a cell deficient in a
particular gene and assessing mutator activity compared to activity in a cell expressing
said gene.

By "components of AID-dependent mutation activity" is meant aspects or cellular
components which contribute to the molecular role of AID (or its homologues) and
includes proteins or nucleic acid components which interact with AID in its mutator
function.

In a further aspect of the invention there is provided a method of screening for a
modulator of AID activity comprising:

- expressing AID in a prokaryotic cell;

- maintaining the AID-expressing prokaryotic cell in the presence of a selectable
medium;

- detecting the presence of colonies in the absence or presence of a test compound
wherein a modified number of colonies when compared to a sample in the absence of
a test compound is indicative of the ability of the test compound to modify AID
mutator activity.
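
The comparison made in the final step of this screen may be sketched as follows, purely by way of example. The Python function below averages colony counts from replicate plates with and without a test compound and reports whether the number of colonies is modified; the replicate design, the two-fold threshold and the function name are hypothetical choices made for the illustration and are not specified by the method.

```python
from statistics import mean
from typing import Sequence


def classify_test_compound(
    colonies_without: Sequence[int],
    colonies_with: Sequence[int],
    min_fold: float = 2.0,  # hypothetical cut-off for calling a change
) -> str:
    """Compare mean colony counts on selective medium with and without a compound."""
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")
    baseline = mean(colonies_without)
    treated = mean(colonies_with)
    # More colonies with the compound: consistent with enhanced mutator activity.
    if treated > 0 and treated >= baseline * min_fold:
        return "enhances"
    # Fewer colonies with the compound: consistent with reduced mutator activity.
    if baseline > 0 and treated * min_fold <= baseline:
        return "inhibits"
    return "no clear effect"


# Hypothetical replicate plate counts.
print(classify_test_compound([10, 12, 8], [40, 44, 36]))
print(classify_test_compound([10, 12, 8], [2, 1, 3]))
print(classify_test_compound([10, 12, 8], [9, 11, 10]))
```

A compound that reduces colony number simply by impairing growth would also score as an inhibitor in this sketch, so a control for viability in the presence of the compound would be needed before any conclusion is drawn.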

By "AID activity" is meant activity of AID or any of its homologues.

Preferably, the modified number of colonies in the presence of the test compound is an
increased number and is therefore indicative of enhanced AID-mediated mutation.

In another aspect of the invention there is provided a method of conferring a mutator
phenotype on a cell comprising expressing AID or its homologues in a cell.

Modifying a cell to confer an increased frequency of mutations by introducing AID
expression is equivalent to a method of introducing mutations into a cell comprising
expressing AID, the mutator protein.

In another aspect of the invention, there is provided a use of AID or a functional
homologue thereof in triggering mutation in a cell. In particular, there is provided a use of
AID to introduce nucleotide transitions at dG/dC as a result of deamination of dC residues
in DNA.

There are several members of the AID/Apobec/phorbolin family in humans (Jarmuz et al.
(2002)). Indeed, overexpression of Apobec-1 is oncogenic in mice (Yamanaka S. et al.
(1995)) and Apobec-1 family members are expressed in many tumour cell lines. The
mutator activity demonstrated herein provides a molecular explanation for the mechanism
of this oncogenesis. Tumour cells generally show an enhanced rate of mutation compared
with non-tumour cells with mutations at dC/dG being the most common nucleoside
substitutions. Thus, the ability to modulate gene products that trigger mutation provides a
method of treating disorders characterised by an increased mutation rate, such as cancer.

Accordingly, in another aspect of the invention there is provided a method for treating a
disorder characterised by increased mutations comprising treating an individual having
such a disorder with an agent that modifies AID or AID homologue functional activity or
gene expression. Suitably the disorder is selected from cancer, autoimmune disease or
other disorders in which increased mutations are correlated with the disease phenotype.

In one embodiment said treatment may be prophylactic i.e. a preventative treatment. This
is particularly applicable to treatment of an individual that may be predisposed to the
development of a specific disorder. For example, an individual may be predisposed to
develop a cancer through, for example, overexpression of AID or its homologues. In such
an individual prophylactic treatment with an agent that modifies AID or AID homologue
functional activity or gene expression may act to prevent the condition developing.

In a preferred embodiment of this aspect, the AID homologue is Apobec-1, Apobec-3G or 
Apobec-3C. 


The development of resistance to antibiotics by a population of bacteria is a problem in
treatment of everyday infections. The ability to decrease the rate at which mutations
conferring antibiotic resistance arise would be desirable. Understanding the
role of AID in generating mutations along with the observation that bacterial cells express
proteins having a similar activity to AID (see, for example, Shen et al. (1992);
Navaratnam et al. (1998)) enables modification of an AID-like mutator activity in bacteria
to modify the rate at which antibiotic resistance arises. Accordingly in another aspect of
the invention there is provided a method of decreasing hypermutation/resistance to a
compound such as an antibiotic in a population of bacteria by modulating bacterial
AID-like activity.

DEFINITIONS 

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture,
molecular genetics, nucleic acid chemistry, hybridisation techniques and biochemistry).
Standard techniques are used for molecular, genetic and biochemical methods. See,
generally, Sambrook et al. Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al. Short
Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc.; as well as
Guthrie et al., Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology,
Vol. 194, Academic Press, Inc., (1991), PCR Protocols: A Guide to Methods and
Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), McPherson et al.,
PCR Volume 1, Oxford University Press, (1991), Culture of Animal Cells: A Manual of
Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, N.Y.), and Gene
Transfer and Expression Protocols, pp. 109-128, ed. E. J. Murray, The Humana Press Inc.,
Clifton, N. J.). These documents are incorporated herein by reference.

The abbreviations used herein include: APOBEC1, apolipoprotein B editing complex
catalytic subunit 1; AID, activation-induced deaminase; TLC, thin-layer chromatography;
PEI, polyethylene imine; UDG, uracil-DNA glycosylase.

The terms "variant" or "derivative" in relation to AID polypeptide include any
substitution of, variation of, modification of, replacement of, deletion of or addition of
one (or more) amino acids from or to the polypeptide sequence of AID. Preferably,
nucleic acids encoding AID are understood to comprise variants or derivatives thereof.

Such "modifications" of AID polypeptides include fusion proteins in which AID
polypeptide or a portion or fragment thereof is linked to or fused to another polypeptide or
molecule.

The term "homologue" as used herein with respect to the nucleotide sequence and the amino
acid sequence of AID may be synonymous with allelic variations in the AID sequences and
includes the known homologues, for example, Apobec-1 and other Apobec homologues
including Apobec3C, Apobec3G, phorbolin and functional homologues thereof.

The "functional activity" of a protein in the context of the present invention describes the
function the protein performs in its native environment. Altering or modulating the
functional activity of a protein includes within its scope increasing, decreasing or
otherwise altering the native activity of the protein itself. In addition, it also includes
within its scope increasing or decreasing the level of expression and/or altering the
intracellular distribution of the nucleic acid encoding the protein, and/or altering the
intracellular distribution of the protein itself. By "AID mutation activity" or "mutator
activity" is meant the functional activity of AID or its homologues to increase mutation
above background.

The term "expression" refers to the transcription of a gene's DNA template to produce the
corresponding mRNA and translation of this mRNA to produce the corresponding gene
product (i.e., a peptide, polypeptide, or protein). The term "activates gene expression"
refers to inducing or increasing the transcription of a gene in response to a treatment
where such induction or increase is compared to the amount of gene expression in the
absence of said treatment. Similarly, the terms "decreases gene expression" or
"down-regulates gene expression" refer to inhibiting or blocking the transcription of a gene in
response to a treatment and where such decrease or down-regulation is compared to the
amount of gene expression in the absence of said treatment.

The "mutation rate" is the rate at which a particular mutation occurs, usually given as the 
number of events per gene per generation whereas "mutation frequency" is the frequency 
at which a particular mutant is found in the population. 
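
The distinction can be illustrated with a short calculation. The following Python sketch uses hypothetical colony counts (not data from this document) and hypothetical function names. It computes the mutation frequency of each culture as resistant colonies divided by viable cells plated, and summarises independent cultures by their median, which is less sensitive than the mean to cultures in which a mutation happened to arise early.

```python
from statistics import median


def mutation_frequency(mutant_colonies, viable_cells):
    """Fraction of viable cells in one culture that carry the selected mutation."""
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return mutant_colonies / viable_cells


def fold_enhancement(test_freqs, control_freqs):
    """Ratio of median mutation frequencies (test over control)."""
    return median(test_freqs) / median(control_freqs)


# Hypothetical counts: (resistant colonies, viable cells plated) per culture.
control = [(2, 1e9), (4, 1e9), (3, 1e9)]
mutator = [(20, 1e9), (45, 1e9), (30, 1e9)]

control_freqs = [mutation_frequency(m, n) for m, n in control]
mutator_freqs = [mutation_frequency(m, n) for m, n in mutator]

print("median control frequency:", median(control_freqs))
print("median mutator frequency:", median(mutator_freqs))
print("fold enhancement:", fold_enhancement(mutator_freqs, control_freqs))
```

A mutation frequency obtained this way describes the population at the time of plating; converting it into a mutation rate per generation requires a model of how mutants accumulate during growth.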

"Hypermutation" or "increased mutation rate" or "increased mutation frequency" refers to
the mutation of a nucleic acid in a cell at a rate above background. Preferably,
hypermutation refers to a rate of mutation of between 10^-5 and 10^-3 bp^-1 generation^-1. This
is greatly in excess of background mutation rates, which are of the order of 10^-9 to 10^-10
mutations bp^-1 generation^-1 (Drake et al., 1998 Genetics 148:1667-1686) and of
spontaneous mutations observed in PCR. 30 cycles of amplification with Pfu polymerase
would produce <0.05x10^-3 mutations bp^-1 in the product, which in the present case would
account for less than 1 in 100 of the observed mutations (Lundberg et al., 1991 Gene
108:1-6).
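
The PCR estimate can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a per-duplication error rate of about 1.3e-6 per bp for Pfu polymerase, a commonly quoted figure that is not stated in this text, and it assumes that errors accumulate linearly with the number of cycles. The function name is hypothetical.

```python
def pcr_error_burden(error_rate_per_bp_per_duplication, cycles):
    """Rough upper-bound estimate of polymerase errors per bp of final product.

    Assumes every cycle duplicates every template, so that errors accumulate
    linearly with the number of cycles.
    """
    return error_rate_per_bp_per_duplication * cycles


# Assumed value (not given in this document): approximate Pfu error rate.
PFU_ERROR_RATE = 1.3e-6

burden = pcr_error_burden(PFU_ERROR_RATE, cycles=30)
print(f"estimated PCR-introduced mutations per bp: {burden:.2e}")
print("below 0.05e-3 per bp:", burden < 0.05e-3)
```

Under these assumptions the polymerase contribution stays below the bound quoted above, so PCR errors would account for only a small fraction of mutations observed at hypermutation rates.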

In vivo, hypermutation is a part of the natural generation of immunoglobulin diversity
through generating variable chain (V) genes. According to one aspect of the present
invention therefore, the cell line is preferably an immunoglobulin-producing cell line
which is capable of producing at least one immunoglobulin V gene. A V gene may be a
variable light chain (VL) or variable heavy chain (VH) gene, and may be produced as part
of an entire immunoglobulin molecule; it may be a V gene from an antibody, a T-cell
receptor or another member of the immunoglobulin superfamily. Members of the
immunoglobulin superfamily are involved in many aspects of cellular and non-cellular
interactions in vivo, including widespread roles in the immune system (for example,
antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for
example the ICAM molecules) and intracellular signalling (for example, receptor
molecules, such as the PDGF receptor). Thus, preferred cell lines according to the
invention are derived from B-cells. According to the present invention, it has been
determined that cell lines derived from antibody-producing B cells may be isolated which
retain the ability to hypermutate V region genes, yet do not hypermutate other genes.

"Class switching" or "switch recombination" is the recombination process in V gene 
rearrangement that leads to a change in the constant region of the expressed antibody.
"Gene conversion" is an additional mechanism in the recombination process which is 
found to occur in chicken and rabbits (but not in human or mouse) and contributes to V 
gene diversification. 

The term "constitutive hypermutation" refers to the ability of certain cell lines to cause
alteration of the nucleic acid sequence of one or more specific sections of endogenous or
transgene DNA in a constitutive manner, that is without the requirement for external
stimulation. Generally, such hypermutation is directed. In cells capable of directed
constitutive hypermutation, sequences outside of the specific sections of endogenous or
transgene DNA are not subjected to mutation rates above background mutation rates. The
sequences which undergo constitutive hypermutation are under the influence of
hypermutation-recruiting elements, as described further below, which direct the
hypermutation to the locus in question. Thus in the context of the present invention,
target nucleic acid sequences, into which it is desirable to introduce mutations, may be
constructed, for example by replacing V gene transcription units in loci which contain
hypermutation-recruiting elements with another desired transcription unit, or by
constructing artificial genes comprising hypermutation-recruiting elements.

The cell population which is subjected to selection by the method of the invention may be
a polyclonal population, comprising a variety of cell types and/or a variety of target
sequences, or a (mono-) clonal population of cells.

A clonal cell population is a population of cells derived from a single clone, such that the
cells would be identical save for mutations occurring therein. Use of a clonal cell
population preferably excludes co-culturing with other cell types, such as activated
T-cells, with the aim of inducing V gene hypermutation.

Brief Description of the Tables and Figures 

Table 1 shows the results of experiments in which AID was expressed in E. coli.

Table 2 shows the results of experiments in which AID and its homologues, Apobec-1,
Apobec3C and Apobec3G, were expressed in E. coli.

Table 3 shows the results of a second set of experiments in which AID and its
homologues, Apobec-1, Apobec2, Apobec3C and Apobec3G, were expressed in E. coli.

Table 4 shows the oligonucleotides used in Example 3. 

Figure Legends 

Fig. 1 DNA deamination model of Ig gene diversification. For details, see text.

Fig. 2 Expression of AID in E. coli yields a mutator phenotype that is enhanced
by UDG-deficiency. (a) Frequencies of Rif^R mutants generated following overnight
culture (± IPTG) of E. coli KL16 carrying either the AID expression plasmid or the vector
control. Each point represents the mutation frequency of an independent overnight
culture. The fold enhancement by AID expression is indicated. (b) Mutation frequency
of AID- and vector-transformed, UDG-deficient KL16 ung-1 cells. Performed and
labeled as in (a), but note the differing y-axis scale. (c) Photograph of representative
plates. The mutation frequency relative to the vector-transformed wildtype control is
indicated in the centre of each plate. See Table 1 for additional data.

Fig. 3 Nature of the AID-induced Rif^R mutants. (a) Comparison of the
distribution of independent rpoB mutations identified in Rif^R colonies obtained from
AID- and vector-transformed cells. The data are combined from results obtained using
both KL16 and AB1157 hosts, but the two hosts show no difference in their mutation
spectrum. The underlined sequence is the region of rpoB which is known (Jin & Zhou
(1996)) to harbour the majority of mutations conferring Rif^R. Less than 5% of the Rif^R
sequenced clones did not show any mutations in this region. (b) Comparison of the types
of rpoB nucleotide substitutions identified.

Figure 4 Comparison of the independent gyrA mutations identified in Nal^R colonies
of AID- and vector-transformed E. coli KL16. Less than 5% of the Nal^R clones analysed
failed to show mutations in the sequenced region.

Figure 5 

(a) Frequencies of Rif^R mutants generated following overnight culture of cells carrying an
APOBEC1 or AID expression construct or the vector control. Each point represents the
mutation frequency of an independent overnight culture. The median mutation frequency
and the fold enhancement by expression of the mutator are indicated. AID and its
homologues, Apobec-1, Apobec3C and Apobec3G, were expressed in E. coli.
(b) Effect of IPTG on APOBEC1-induced mutation to Rif^R. The mutation observed in
the absence of IPTG may well be due to pTrc99A promoter leakiness. Labeled as in (a).
(c) Single amino acid changes in APOBEC1 abrogate its ability to stimulate mutation to
Rif^R. Labeled as in (a).
(d) Comparison of average growth rates of vector- and APOBEC1-transformed cells
propagated in the presence of the inducer IPTG. Five independent cultures were used for
each measurement, but the standard deviations proved smaller than the symbols.

Figure 6 Spectrum of Rif^R mutations found in cells expressing APOBEC1.
(a) Comparison of the distribution of independent Rif^R mutations found in cells
transformed with vector alone or an APOBEC1 expression construct. The preferred sites
in AID-expressing cells are highlighted by dark boxes.
(b) Summary of the types of nucleotide substitutions in rpoB identified in Rif^R vector-
and APOBEC1-transformed cells given as a percentage of the total database (120 from
controls and 136 from APOBEC1-transformed cells).

Figure 7 APOBEC1, APOBEC3C and APOBEC3G all stimulate mutation at dC/dG but
with distinct target specificities.
(a) Schematic of the APOBEC1 family of mutator proteins depicting the putative
zinc-binding deaminase motif and the conserved leucine-rich region. Other APOBEC1 family
members also contain either single (APOBEC2 and APOBEC3A) or double (APOBEC3B
and APOBEC3F) putative zinc-binding motifs (Madsen et al.). APOBEC3D and
APOBEC3E may be a single protein with two zinc-binding regions as evidenced by
IMAGE clone 3915193 or two separate, single zinc-binding motif proteins (Jarmuz et al.).
For each protein, the enhancement of mutation to Rif^R yielded by that protein (data from
Table 3), the percentage of the mutations observed that were nucleotide transitions at dC
or dG and the identity of the major rpoB mutational hotspots observed (the percentage of
the total number of rpoB mutations observed at that hotspot given in parentheses) are all
given. The total number of mutated rpoB sequences analysed (n) for each APOBEC1
family member is given.
(b) Distribution of rpoB mutations in Rif^R mutants obtained using bacteria transformed
with different APOBEC family members. There are 26 sites within the sequenced region
of rpoB where a single nucleotide substitution can yield Rif^R; at 11 of these sites, Rif^R can
be achieved by a transition at dC or dG. The percentage of the total number of Rif^R
mutations obtained with each APOBEC family member that occurred at each of these 11
sites is indicated. Mutations at other sites are not indicated (an omission which is mainly
of significance to the depiction of the vector control).

Figure 8a shows the pRB700 construct comprising the Bacillus subtilis gene SacB under the
control of the E. coli promoter for PhoB.

Figure 8b shows the pRB740 construct comprising a variant SacB cassette under the
control of the PhoB promoter and also under the control of the strong IPTG inducible Trc
promoter downstream and in the opposite orientation.

Figure 9 shows the results of mutation analysis in mutants in the SacB cassette. 

Figure 10a shows mutation frequency in constructs when transcription is induced in
either or both directions.

Figure 10b shows the results of mutation frequency analysis. pRB700 and pRB740 are
described in Figure 9. Vector control and APOBEC-1 expression plasmids pTrc99a and
pRH200 are as described (Harris et al. 2002 Mol Cell. 10(5): 1247-53). Growth media all
include 100 µg/ml carbenicillin and 1 mM IPTG to maintain and induce expression
plasmids. LB = Luria Bertani medium. Min MOPS = Minimal MOPS medium
(Neidhardt et al. 1974. Culture medium for enterobacteria. J Bacteriol. 119:736-47) using
0.1% glycerol as carbon source supplemented with 2 µM Zn2+ and 0.1% casamino-acids
(C, D, G, H) or bacto-peptone (I).

Figure 11 shows a table of results for mutation analysis. 

Figure 12 shows the results of assaying for DNA deaminase activity in crude extracts
using the TLC-based assay.
A, Schematic representation of the TLC-based deaminase assay. α-[32P]dCMP-labelled
single-stranded DNA was incubated with the indicated extracts, purified, digested
with P1 nuclease and analysed by TLC in one of two buffer systems.
B, Analysis by TLC in either the LiCl [panel (i)] or CH3COOH+LiCl [panels (ii) and
(iii)] buffer systems of the assay products of α-[32P]dCMP-labelled single-stranded DNA
incubated with sonic extracts of E. coli transformants that carry plasmids directing the
overexpression of APOBEC1, APOBEC2, a mutant APOBEC1 (harbouring an E63->A
substitution) or dCTP deaminase (DCD). Controls are provided by extracts from E. coli
transformed with vector only (-) as well as by substrate DNA that has been subjected to
chemical deamination using bisulfite. The plasmid/host strain combination used for
recombinant protein expression was pTrc99/E. coli KL16 except where (as indicated) the
pET vector was used (in which case the host strain was BL21DE3) or where activity was
monitored using the E. coli S0177 host (which is deficient in both dcd and cdd
deaminases). The migration of dUMP, dCMP and inorganic phosphate (Pi) markers
is indicated. The abundance of wild-type and E63->A mutant APOBEC1 polypeptides in
extracts was monitored by Western blot [lower part of panel (iii)].

Figure 13 shows APOBEC1 fractionation.
A, Ion-exchange chromatography on Sepharose Mono-Q. Clarified lysates of
APOBEC1 (and APOBEC1[E63->A])-expressing E. coli were loaded onto Mono-Q. The
presence of APOBEC1 polypeptide was detected by Western blot [panel (ii)]. Deaminase
activity was monitored by both TLC- and UDG-based assays [panels (i) and (iii)] in the
total lysate (T), the flow through (FT) and in the 800 and 1000 mM-salt washes.
B, Gel filtration of the concentrated high (>1 M) salt eluate from the Mono-Q column
on Sephacryl S200. Fractions were analysed by: (i) SDS/PAGE; bands were excised and
analysed by MALDI-TOF following in-gel trypsin digestion. The bands yielding peptide
sequences derived from APOBEC1 and ribosomal proteins L1, 2, 6 and 9 and S4 are
indicated. M, molecular weight markers; (ii) Western blotting for APOBEC1; (iii) TLC-based
and (iv) UDG-based deaminase assays, which were performed on samples of the
total clarified bacterial lysate (T) as well as on the eluate from the Mono-Q. The UDG-based
deaminase assay was performed using 3'-α-[32P]-labelled SPM274; note that some
of the 3'-label is removed during the incubation. The percentage of label associated with
the 26-base product of the deamination/cleavage (as opposed to 40-base input
oligonucleotide) is indicated.

Figure 14 shows specificity of APOBEC1-mediated DNA deamination using the UDG-based
assay.
A, Schematic representation of the UDG-based deaminase assay. 5'-biotinylated
(circle) oligonucleotides that were 3'-labelled (asterisk) with fluorescein or
α-[32P]dideoxyadenylate were incubated with APOBEC1-containing (or control) samples
prior to streptavidin purification, UDG-treatment and PAGE-urea analysis.
B, Partially purified APOBEC1 as well as the E63->A mutant were tested for their
ability to deaminate 3'-fluorescein conjugated oligonucleotide SPM168 using the UDG-based
assay. The fluorescence scan of the gel, including controls performed without UDG
treatment or without APOBEC1, is shown with the positions of the expected products and
size markers indicated.
C, Time-course of SPM168 deamination by partially purified APOBEC1.
D, Inclusion of RNAase (1 ng) or of tetrahydrouridine (THU; 20 nmoles, 2 nmoles,
or 200 pmoles) does not inhibit the activity of APOBEC1.
E, Deaminating activity is specific for a single-stranded substrate. The assay was
performed using 3'-fluorescein-labelled oligonucleotide SPM168 in the presence of the
indicated ratio of either oligonucleotide SPM171 (which is complementary to SPM168) or
SPM201 (which is not).
F, Comparison of 3'-fluorescein labelled oligonucleotides SPM168 (left three lanes)
and SPM163 (right three lanes) as targets for deamination by 0.5, 1 and 2 µl of
APOBEC1.
G, Comparison of 3'-α-[32P]-labelled oligonucleotides SPM274, SPM275 and
SPM276 as targets for deamination by 0.3, 0.6, 0.9 and 1.8 µl of APOBEC1.



Figure 15 Autoradiographs showing hybridisation of APOBEC1, APOBEC3G, and
ubiquitin (control) probes to matched pairs of tumour (T) and corresponding normal (N)
cDNA samples derived from a variety of tissues using a cancer profiling array (Clontech).




PCT/GB03/02002 



Detailed Description of the Invention 



The fact that AID, a homologue of Apobec-1 (which deaminates C in RNA), is required
for all three programmes of diversification of rearranged immunoglobulin genes
(Muramatsu M. et al. (2000); Revy, P. et al. (2000); Arakawa, H. et al. (2002); Harris,
R.S. et al. (2002) and Martin, A. et al. (2002)) and that the initiation of all three
programmes could be explained by DNA modification at dG/dC (Martin et al. (2002),
Maizels et al. (1995), Weill et al. (1996); Sale et al. (2001), Ehrenstein et al. (1999), Rada
et al. (1998) and Wiesendanger et al. (2000)) led the present inventors to the model
presented in Fig. 1. The hypothesis set out herein is that AID mediates the deamination of
a small number of C residues within the Ig loci. Conventionally, this would trigger base
excision repair (Lindahl T. (2000)) with uracil being removed by uracil-DNA glycosylase
(UDG) and, following cleavage at the abasic site by an apyrimidinic endonuclease (APE), a
dC residue would be reinserted by a DNA polymerase/deoxyribophosphodiesterase. If,
instead of being repaired, the DNA strand harbouring the dU residue were used to
template DNA synthesis, then the consequence would be a dC->dT (and dG->dA)
transition. Alternatively, if DNA synthesis occurred over the abasic site, both transitions
and transversions would be generated although a transition bias might still be observed if
the polymerase used for the lesion bypass preferentially inserted dA residues. Thus, the
stage at which polymerase bypass of the original lesion occurred as well as the
preferences of the polymerase used would affect the transition bias of the hypermutation.
This could account for the otherwise puzzling observation that whereas mutation in
mouse and man as well as in the hypermutating Ramos B cell line exhibits a marked
transition preference (Sale et al. (1998)), no such preference is evident in the mutations
exhibited by the XRCC2-deficient chicken DT40 B cell line (Sale et al. (2001)).
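
The replication outcome described in this model, in which an unrepaired dU templates a dC->dT change on one strand and appears as dG->dA on the other, can be traced with a toy example. The Python sketch below is purely illustrative: the sequence and function names are hypothetical, strand polarity is ignored so that positions stay aligned, and dU is simply treated as pairing like dT.

```python
# Base-pairing table; dU pairs like dT, so dA is inserted opposite it.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def complement(strand):
    """Position-aligned complement of a strand (polarity ignored)."""
    return "".join(COMPLEMENT[base] for base in strand)


def deaminate(strand, position):
    """Return the strand with the dC at `position` deaminated to dU."""
    if strand[position] != "C":
        raise ValueError("only dC can be deaminated to dU")
    return strand[:position] + "U" + strand[position + 1:]


def replicate(template):
    """Copy a template twice to obtain a daughter strand of the same sense."""
    return complement(complement(template))


top = "AGCTA"              # hypothetical sequence
bottom = complement(top)   # "TCGAT"

# dC deaminated on the top strand: read out as dC->dT on the top strand.
top_after = replicate(deaminate(top, 2))
print(top, "->", top_after)

# dC deaminated on the bottom strand: read out as dG->dA on the top strand.
top_after_bottom_hit = complement(replicate(deaminate(bottom, 1)))
print(top, "->", top_after_bottom_hit)
```

The sketch covers only the case in which the dU-containing strand is copied directly; bypass of an abasic site, which the model predicts can also yield transversions, is not represented.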

Templated repair of the deamination-induced lesion by a V pseudogene would lead to
gene conversion; such repair would be dependent on the RAD51 paralogues XRCC2,
XRCC3 and RAD51B (Sale et al. (2001)). The second phase of mutation (yielding
mutations at dA/dT) which is observed in vivo in man and mouse would be triggered by
MSH2/MSH6 recognition of the dU/dG mismatch itself or of some intermediate in its
correction (Rada et al. (1998); Wiesendanger et al. (2000)), and would presumably occur
by some form of patch repair. Repair partnered on another switch region could lead to
switch recombination. For switching, where there is an indication for a role of
non-homologous end-joining (Manis et al. (2002); Peterson et al. (2001)), one might imagine
that deamination of proximal dCs on opposite strands could generate the staggered DNA
breaks proposed by Chen et al. (2001).

A central prediction of this model is that AID has the ability to trigger dC->dU
deamination in DNA. Such an activity would presumably be largely restricted to its
physiological target (the Ig loci) since a rampant DNA deaminase activity would likely be
harmful to the cell.

The results presented herein suggest that, whereas functional Ig genes are generated by
RAG-mediated rearrangement, subsequent diversification is triggered by AID-mediated
deamination of dC residues within the immunoglobulin locus with the outcome (gene
conversion, switch recombination or mutation phases 1/2) dependent upon the way in
which the initiating dU/dG lesion is resolved.

As well as AID, the APOBEC/AID family contains several members that are capable of
mutating DNA, triggering nucleotide substitutions at dC/dG by a process which, given its
sensitivity to uracil-DNA glycosylase, is likely to be dC deamination.

The physiological functions of the other APOBEC family members are unknown.
Whereas APOBEC1 shows relatively restricted tissue distribution, APOBEC3G is much
more widely expressed. Hybridisation experiments suggest that some APOBEC family
members are well expressed in a variety of cancers (Fig. 8) and cancer cell lines.

Quite apart, however, from the normal physiological functions of the APOBEC family
members, the fact that several of the members can display a DNA mutator activity (taken
together with the observation that transgenic expression of APOBEC1 is oncogenic in
mice) raises the possibility that they might contribute to the 'spontaneous' dC
deamination that occurs in normal cells as well as the elevated mutation rates proposed to
be associated with many human cancers. Indeed, in the large database of p53 mutations in
human cancers (where nearly 13,000 single base changes have been identified scattered
over a large number of positions in the gene) over 50% of the substitution mutations (and
over 60% of the silent mutations) are nucleotide transitions at dC/dG with roughly half of
these dC/dGs being at dCpdG dinucleotides.

Measuring an enhanced mutation rate in cells as an indication of a mutator phenotype

Hypermutating cells or cells having a mutator phenotype may be identified by a variety of
techniques, including sequencing of target sequences, selection for expression loss
mutants, assay using bacterial MutS protein and selection for change in gene product
activity. Methods for measuring mutation rates include fluctuation analysis (described, for
example, by Luria and Delbrück (1943) and Capizzi and Jameson (1973)). In this, the
generation of clones showing resistance to a selection medium is measured. Suitable
selection media for prokaryotic cells include rifampicin, nalidixic acid, valine and fucose.
Cells selected according to this procedure are cells in which mutation has occurred in a
gene or genes which enable the effect of the selection medium to be overcome. Other ways
of determining mutation rates include direct sequencing of specific portions of DNA or
indirect methods such as the MutS assay (Jolly et al., 1997 Nucleic Acids Research 25,
1913-1919) or monitoring the generation of immunoglobulin loss variants.
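
As an illustration of fluctuation analysis, the following Python sketch applies the P0 method, one simple estimator associated with Luria and Delbrück, to hypothetical counts from parallel cultures. It is a sketch under stated assumptions rather than the procedure used in this document: the counts and the function name are invented for illustration, and dividing by the final cell number per culture is only one of several conventions for converting events per culture into a rate.

```python
import math


def p0_mutation_rate(mutants_per_culture, cells_per_culture):
    """Estimate the mutation rate per cell by the P0 method.

    m = -ln(p0) is the mean number of mutational events per culture, where
    p0 is the fraction of parallel cultures that contain no mutants.
    """
    n_cultures = len(mutants_per_culture)
    n_zero = sum(1 for count in mutants_per_culture if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        raise ValueError("P0 method needs some, but not all, cultures without mutants")
    p0 = n_zero / n_cultures
    m = -math.log(p0)
    return m / cells_per_culture


# Hypothetical resistant-colony counts from 10 parallel cultures.
counts = [0, 0, 3, 0, 1, 0, 12, 0, 0, 2]
rate = p0_mutation_rate(counts, cells_per_culture=1e8)
print(f"estimated mutation rate: {rate:.2e} per cell")
```

The P0 method is only usable when a reasonable fraction of cultures contain no mutants; with strong mutators, where nearly every culture yields resistant colonies, median-based estimators are generally preferred.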

In a preferred embodiment of the invention, the method involves generating mutations in
a target nucleic acid which encodes an immunoglobulin. Immunoglobulin loss may be
detected both for cells which secrete immunoglobulins into the culture medium, and for
cells in which the immunoglobulin is displayed on the cell surface. Where the
immunoglobulin is present on the cell surface, its absence may be identified for individual
cells, for example by FACS analysis, immunofluorescence microscopy or ligand
immobilisation to a support. In a preferred embodiment, cells may be mixed with
antigen-coated magnetic beads which, when sedimented, will remove from the cell
suspension all cells having an immunoglobulin of the desired specificity displayed on the
surface.

The technique may be extended to any immunoglobulin molecule, including antibodies,
T-cell receptors and the like. The selection of immunoglobulin molecules will depend on
the nature of the clonal population of cells which it is desired to assay according to the
invention.

Alternatively, mutations in cells according to the invention may be identified by 
sequencing of target nucleic acids, such as V genes, and detection of mutations by 
sequence comparison. This process may be automated in order to increase throughput. 
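
Automated sequence comparison of this kind can be sketched in a few lines of Python. The example below is a minimal illustration that assumes the parental and progeny sequences are already aligned and of equal length; the sequences and function names are hypothetical and not taken from this document. Each substitution is reported with its position and classified as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref_base, new_base):
    """Label a single-base substitution as a transition or a transversion."""
    pair = {ref_base, new_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def find_substitutions(reference, sample):
    """List (position, ref_base, new_base, kind) for two aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, r, s, classify(r, s))
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical parental and progeny sequences.
parental = "ATGCCGTAC"
progeny = "ATGTCGAAC"

hits = find_substitutions(parental, progeny)
for position, ref_base, new_base, kind in hits:
    print(position, f"{ref_base}->{new_base}", kind)
```

Real sequencing data would first need alignment and quality filtering, which this sketch deliberately leaves out.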

In a further embodiment, cells which hypermutate V genes may be detected by assessing
change in antigen binding activity in the immunoglobulins produced in a clonal cell
population. For example, the quantity of antigen bound by a specific unit amount of cell
medium or extract may be assessed in order to determine the proportion of
immunoglobulin produced by the cell which retains a specified binding activity. As the V
genes are mutated, so binding activity will be varied and the proportion of produced
immunoglobulin which binds a specified antigen will be reduced.

Alternatively, cells may be assessed in a similar manner for the ability to develop a novel
binding affinity, such as by exposing them to an antigen or mixture of antigens which are
initially not bound and observing whether a binding affinity develops as the result of
hypermutation.

In a further embodiment, the bacterial MutS assay may be used to detect sequence
variation in target nucleic acids. The MutS protein binds to mismatches in nucleic acid
hybrids. By creating heteroduplexes between parental nucleic acids and those of
potentially mutated progeny, the extent of mismatch formation, and thus the extent of
nucleic acid mutation, can be assessed.

Where the target nucleic acid encodes a gene product other than an immunoglobulin,
selection may be performed by screening for loss or alteration of a function other than
binding. For example, the loss or alteration of an enzymatic activity may be screened for.

Genetic Manipulation of cells

Cells modified to express AID or its homologues are cells in which AID protein
expression (or AID homologue protein expression) has been induced by means, for
example, of transfecting host cells with a vector encoding AID protein. Such transfection
may be stable or transient transfection.

"Vector" refers to any agent such as a plasmid, cosmid, virus, autonomously replicating
sequence, phage, or linear single-stranded, circular single-stranded, linear
double-stranded, or circular double-stranded DNA or RNA nucleotide sequence that carries
exogenous DNA into a host cell or organism. The recombinant vector may be derived
from any source. In the context of the present invention, the vector is for stable expression
of AID and is, therefore, capable of genomic integration or autonomous replication but
maintained throughout division cycles of the host cell.



An expression vector includes any vector capable of expressing a coding sequence
encoding a desired gene product that is operatively linked with regulatory sequences, such
as promoter regions, that are capable of expression of such DNAs. Thus, an expression
vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage,
recombinant virus or other vector, that upon introduction into an appropriate host cell,
results in expression of the cloned DNA. Appropriate expression vectors are well known
to those with ordinary skill in the art and include those that are replicable in eukaryotic
and/or prokaryotic cells and those that remain episomal or those which integrate into the
host cell genome. For example, DNAs encoding a heterologous coding sequence may be
inserted into a vector suitable for expression of cDNAs in mammalian cells, e.g. a CMV
enhancer-based vector such as pEVRF (Matthias, et al., 1989).

Construction of vectors according to the invention employs conventional ligation
techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the
form desired to generate the plasmids required. If desired, analysis to confirm correct
sequences in the constructed plasmids is performed in a known fashion. Suitable methods
for constructing expression vectors, preparing in vitro transcripts, introducing DNA into
host cells, and performing analyses for assessing gene product expression and function are
known to those skilled in the art. Gene presence, amplification and/or expression may be
measured in a sample directly, for example, by conventional Southern blotting, Northern
blotting to quantitate the transcription of mRNA, dot blotting (DNA or RNA analysis), or
in situ hybridisation, using an appropriately labelled probe which may be based on a
sequence provided herein. Those skilled in the art will readily envisage how these
methods may be modified, if desired.

Vector-driven protein expression can be constitutive or inducible. Inducible vectors 
include either naturally inducible promoters, such as the trc promoter, which is regulated
by the lac operon, the DPTG promoter which is inducible by IPTG and the pL promoter, 
which is regulated by tryptophan, the MMTV-LTR promoter, which is inducible by 
dexamethasone, or can contain synthetic promoters and/or additional elements that confer 
inducible control on adjacent promoters. Other promoters include E.coli promoters such 
as PhoB.

Methods for introducing the vectors and nucleic acids into host cells are well known in 
the art; the choice of technique will depend primarily upon the specific vector to be 
introduced and the host cell chosen. Plasmid vectors will typically be introduced into 
chemically competent or electrocompetent bacterial cells. Vectors can be introduced into
yeast cells by spheroplasting, treatment with lithium salts, electroporation, or protoplast
fusion. Mammalian and insect cells can be directly infected by packaged viral vectors, or
transfected by chemical or electrical means. 

Methods for generating fusion proteins

AID or any of its homologues or derivatives, including Apobec-1, may be generated as 
fusion proteins comprising the AID protein or a portion that retains its mutator activity 
coupled to a DNA binding domain or one half of a specific binding pair. Preferably the 
fusion protein will not hinder the mutator activity of the protein sequence. Methods for
generating fusion proteins will be familiar to those skilled in the art and include
generation of expression vectors comprising the AID nucleic acid sequence linked or
ligated to the nucleic acid sequence encoding a DNA binding domain. 

Methods for preparing and selecting immunoglobulins or other surface expressed
proteins.

The process of hypermutation is employed, in nature, to generate improved or novel
binding specificities in immunoglobulin molecules. Thus, by selecting cells according to
the invention which produce immunoglobulins capable of binding to the desired antigen
and then propagating these cells in order to allow the generation of further mutants, cells
which express immunoglobulins having improved binding to the desired antigen may be
isolated. 

A variety of selection procedures may be applied for the isolation of mutants having a 
desired specificity. These include Fluorescence Activated Cell Sorting (FACS), cell
separation using magnetic particles, antigen chromatography methods and other cell 
separation techniques such as use of polystyrene beads. 

Separating cells using magnetic capture may be accomplished by conjugating the antigen 
of interest to magnetic particles or beads. For example, the antigen may be conjugated to
superparamagnetic iron-dextran particles or beads as supplied by Miltenyi Biotec GmbH.
These conjugated particles or beads are then mixed with a cell population which may
express a diversity of surface immunoglobulins. If a particular cell expresses an
immunoglobulin capable of binding the antigen, it will become complexed with the
magnetic beads by virtue of this interaction. A magnetic field is then applied to the
suspension which immobilises the magnetic particles, and retains any cells which are
associated with them via the covalently linked antigen. Unbound cells which do not
become linked to the beads are then washed away, leaving a population of cells which is
isolated purely on its ability to bind the antigen of interest. Reagents and kits are
available from various sources for performing such one-step isolations, and include Dynal
Beads (Dynal AS; http://www.dynal.no), MACS-Magnetic Cell Sorting (Miltenyi Biotec
GmbH; http://www.miltenyibiotec.com), CliniMACS (AmCell; http://www.amcell.com)
as well as Biomag, Amerlex-M beads and others. Similar techniques can be used for
non-immunoglobulin surface expressed molecules where selection for their surface expression
can be through recognition by a specific binding partner. 

Fluorescence Activated Cell Sorting (FACS) can be used to isolate cells on the basis of
their differing surface molecules, for example surface displayed immunoglobulins. Cells
in the sample or population to be sorted are stained with specific fluorescent reagents
which bind to the cell surface molecules. These reagents would be the antigen(s) of
interest linked (either directly or indirectly) to fluorescent markers such as fluorescein,
Texas Red, malachite green, green fluorescent protein (GFP), or any other fluorophore
known to those skilled in the art. The cell population is then introduced into the vibrating
flow chamber of the FACS machine. The cell stream passing out of the chamber is
encased in a sheath of buffer fluid such as PBS (Phosphate Buffered Saline). The stream
is illuminated by laser light and each cell is measured for fluorescence, indicating binding
of the fluorescent labelled antigen. The vibration in the cell stream causes it to break up
into droplets, which carry a small electrical charge. These droplets can be steered by
electric deflection plates under computer control to collect different cell populations
according to their affinity for the fluorescent labelled antigen. In this manner, cell
populations which exhibit different affinities for the antigen(s) of interest can be easily
separated from those cells which do not bind the antigen. FACS machines and reagents
for use in FACS are widely available from sources world-wide such as Becton-Dickinson,
or from service providers such as Arizona Research Laboratories 
(http://www.arl.arizona.edu/facs/). 

Another method which can be used to separate populations of cells according to the
affinity of their cell surface protein(s) for a particular antigen is affinity chromatography.
In this method, a suitable resin (for example CL-600 Sepharose, Pharmacia Inc.) is
covalently linked to the appropriate antigen. This resin is packed into a column, and the
mixed population of cells is passed over the column. After a suitable period of incubation
(for example 20 minutes), unbound cells are washed away using (for example) PBS
buffer. This leaves only that subset of cells expressing immunoglobulins which bound the
antigen(s) of interest, and these cells are then eluted from the column using (for example)
an excess of the antigen of interest, or by enzymatically or chemically cleaving the antigen
from the resin. This may be done using a specific protease such as factor X, thrombin, or
other specific protease known to those skilled in the art to cleave the antigen from the
column via an appropriate cleavage site which has previously been incorporated into the
antigen-resin complex. Alternatively, a non-specific protease, for example trypsin, may
be employed to remove the antigen from the resin, thereby releasing that population of
cells which exhibited affinity for the antigen of interest.

Insertion of heterologous transcription units 

In order to maximise the chances of quickly selecting an antibody variant capable of
binding to any given antigen, or to exploit the AID-dependent hypermutation system for
non-immunoglobulin genes, a number of techniques may be employed to engineer cells 
according to the invention such that their hypermutating abilities may be exploited. 

In a first embodiment, transgenes are transfected into a cell according to the invention 
such that the transgenes become targets for the directed hypermutation events. 

As used herein, a "transgene" is a nucleic acid molecule which is inserted into a cell, such 
as by transfection or transduction. For example, a "transgene" may comprise a
heterologous transcription unit as referred to above, which may be inserted into the
genome of a cell at a desired location. The "transgene" may be the nucleic acid encoding
the gene product of interest. 

The plasmids used for delivering the transgene to the cells are of conventional
construction and comprise a coding sequence, encoding the desired gene product, under
the control of a promoter. Gene transcription from vectors in cells according to the
invention may be controlled by promoters derived from the genomes of viruses such as
polyoma virus, adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus,
cytomegalovirus (CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous
mammalian promoters such as the actin promoter or a very strong promoter, e.g. a
ribosomal protein promoter, and from the promoter normally associated with the
heterologous coding sequence, provided such promoters are compatible with the host
system of the invention.



Transcription of a heterologous coding sequence by cells according to the invention may 
be increased by inserting an enhancer sequence into the vector. Enhancers are relatively
orientation and position independent. Many enhancer sequences are known from
mammalian genes (e.g. elastase and globin). However, typically one will employ an
enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late
side of the replication origin (bp 100-270) and the CMV early promoter enhancer. The
enhancer may be spliced into the vector at a position 5' or 3' to the coding sequence, but is
preferably located at a site 5' from the promoter. 



Advantageously, a eukaryotic expression vector may comprise a locus control region 
(LCR). LCRs are capable of directing high-level integration site independent expression 
of transgenes integrated into host cell chromatin, which is of importance especially where
the heterologous coding sequence is to be expressed in the context of a
permanently-transfected eukaryotic cell line in which chromosomal integration of the vector has
occurred, in vectors designed for gene therapy applications or in transgenic animals.

Eukaryotic expression vectors will also contain sequences necessary for the termination of
transcription and for stabilising the mRNA. Such sequences are commonly available from
the 5' and 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions
contain nucleotide segments transcribed as polyadenylated fragments in the untranslated 
portion of the mRNA. 

Transgenes according to the invention may also comprise sequences which direct 
hypermutation. Such sequences have been characterised, and include those sequences set 
forth in Klix et al. (1998; Eur. J. Immunol. 28:317-326), and Sharpe et al. (1991; EMBO
J. 10:2139-2145), incorporated herein by reference. Thus, an entire locus capable of
expressing a gene product and directing hypermutation to the transcription unit encoding
the gene product is transferred into the cells. The transcription unit and the sequences
which direct hypermutation are thus exogenous to the cell. However, although exogenous,
the sequences which direct hypermutation themselves may be similar or identical to the
sequences which direct hypermutation naturally found in the cell.

The endogenous V gene(s) or segments thereof may be replaced with heterologous V 
gene(s) by homologous recombination, or by gene targeting using, for example, a Lox/Cre
system or an analogous technology or by insertion into hypermutating cell lines which
have spontaneously deleted endogenous V genes. Alternatively, V region gene(s) may be
replaced by exploiting the observation that hypermutation is accompanied by double
stranded breaks in the vicinity of rearranged V genes. 

Furthermore, enhanced targeting of mutation can be achieved by inducing convergent
promoters upstream and downstream of the desired gene and therefore inducing
transcription in both directions. Deamination of dC in vitro by APOBEC-1 has been
demonstrated to be dependent on the single-strandedness of the substrate oligonucleotide
as described herein. The increase in availability of single-stranded DNA can be induced
by convergent transcription or by a combination of transcription and DNA bending caused
by promoter activation. Suitable types of promoter include the PhoB promoter. Other
Prokaryotic promoters include Activators (e.g. AraBAD, PhoA), Repressors (e.g. Tet,
Lac, Trp, Hybrid Lac/Trp such as Tac, pL, Regulatable hybrids of pL such as pL-tet) and
Viral Polymerase (e.g. T7). Suitable Eukaryotic promoters include promoters recognised
by RNA Polymerase I (e.g. 45S rDNA), RNA Polymerase II (e.g. Gal4, β-Actin or Viral,
such as CMV-IE and Artificial, especially Tet-on, Tet-off) and RNA Polymerase III (e.g. H1
RNA, U6 snRNA).

DNA binding domain and specific DNA recognition sequences

Transcription factors bind DNA by recognising specific target sequences generally located 
in enhancers, promoters, or other regulatory elements that affect a particular target gene. 
The target sequences for a number of transcription factors are well known to those skilled 
in the art. Transcription factors having specific DNA target or recognition sequences
include the yeast transcription factors such as GAL4, bacterial proteins such as the
repressor protein LexA and mammalian transcription factors such as estrogen receptor.
The DNA binding domain within such proteins serves to bind the protein to the target
sequence or "DNA binding protein recognition sequence" and therefore bring the protein
to a set location within a DNA sequence. 

One particular type of transcription factor binding site is named a "response element",
which is a particular DNA sequence which causes a gene to respond to a regulatory
transcription factor. Examples include the heat shock response element (HRE) and the
glucocorticoid response element (GRE). A number of hormone response elements are also
known to those skilled in the art. Response elements contain short consensus sequences
which are the target or recognition for the DNA binding domains found within the
corresponding inducible transcription factors such that, for example, transcription factors
induced by a heat shock response bind HREs, glucocorticoid-induced factors bind GREs
etc. Other examples include the binding of estrogen receptor via a DNA binding domain
to the specific DNA binding protein recognition sequence called the ERD or estrogen
response domain. The interaction of transcription factors and response elements is
described, for example in Genes VI, Lewin, Oxford University Press, 1997. Comparisons
between the sequences of many transcription factors suggest that common types of motif
can be found that are responsible for binding to DNA. Such motifs include the zinc finger
motif, the helix-turn-helix or the helix-loop-helix. Other such motifs are known to the
person skilled in the art.

The interaction between a DNA binding domain and a DNA binding protein recognition
sequence can be used to direct mutation to a specific nucleic acid sequence. One way of
directing mutation in this way is described as follows: an expression construct for
expressing a fusion protein comprising Apobec with the estrogen receptor DNA binding
domain (ERD) (Schwabe et al. Cell. 1993 Nov 5;75(3):567-78) is constructed as
described below. The expression construct is expressed in yeast/E. coli using standard
transfection procedures. The yeast/E. coli host cell is also engineered such that the
desired target gene is also linked to a short ERD recognition sequence (Schwabe et al.,
1993).

Screening for modulators of AID activity

Compounds having inhibitory, activating, or modulating activity can be identified using in 
vitro assays for activity and/or expression of AID or its homologues including APOBEC1,
APOBEC2, APOBEC3C and APOBEC3G, e.g., ligands, agonists, antagonists, and their
homologs and mimetics.

Modulator screening may be performed by adding a putative modulator test compound to 
a cell expressing AID (or its homologues) in accordance with the invention, and 
monitoring the effect of the test compound on the function and/or expression of AID. A
parallel sample which does not receive the test compound is also monitored as a control. 
The treated and untreated cells are then compared by any suitable phenotypic criteria, and 
in particular by comparing the mutator phenotype of the treated and untreated cells using 
methods as described herein. 

The invention is further described below, for the purposes of illustration only, in the 
following examples. 

EXAMPLES

EXAMPLE 1

A plasmid containing a human AID cDNA expressed under control of the lac promoter
was transformed into E. coli strain KL16 and its effect on the frequency of mutation to
rifampicin-resistance (Rif^R) measured by fluctuation analysis (Fig. 2 and Table 1).

The AID expression plasmid was generated by cloning the human AID cDNA (Harris et
al. (2002)) on an NcoI-HindIII fragment into pTrc99A (Pharmacia; gift of R. Sawa).
E. coli strains KL16 (Hfr (PO-45) relA1 spoT1 thi-1) and its ung-1 derivative (BW310) as
well as AB1157 and its nfi-1::cat derivative (BW1161) were from B. Weiss; GM1003
(dcm-6 thr-1 hisG4 leuB6 rpsL ara-14 supE44 lacY1 tonA31 tsx-78 galK2 galE2 xyl-5
thi-1 mtl-1 mug::mini-Tn10) derivatives carrying ung-1 and/or mug::mini-Tn10 mutations
were from A. Bhagwat.

APOBEC1 and APOBEC2 expression constructs were generated by subcloning the rat
APOBEC1 cDNA (BamHI-SalI fragment of pSB202; gift from N. Navaratnam and J.
Scott) or the human APOBEC2 cDNA (NcoVBsaM fragment from IMAGE clone
341062). APOBEC3C was amplified from the Ramos human Burkitt lymphoma cell line
cDNA using oligonucleotides 5' NNNGAMI£AACKKnx2AACATGAATC^ and 5'
NNlWNCTCeACGGAGACCCCTCACTGGAGA. APOBEC3G was amplified from IMAGE
10 clone 1284557 using oligonucleotides 5'-NNNGMIlCAAGGATGAAGCCTCA<nTCA(U and 
5* NNGACTSCAGCCCATCCTrCAGTTTTC^ 

The E63A, W90S and C93A substitutions in APOBEC1 were introduced by site-directed
mutagenesis using the following oligonucleotide pairs:
5'-ACCAACAAACACGTTGcAGTCAATTTCATAGAAA/TTTCTATGAAA
GTTGGT,
5'-ACXTGGTTCCTGTCCTcGAGTCCO'GTGGGGAG/CTCC^
AGGT,
and
CTGTCCTGGAGTCCCgcTGGGGAGTGCTCClVGG/CCTGGAGCACrOCC^
AG (substitutions in lower case).

All constructs were verified by DNA sequencing and were identical to published
sequences (Madsen et al. (1999), J. Invest. Dermatol. 113, 162-169; Jarmuz et al.; Anant
et al. (2001), Am. J. Physiol. Cell Physiol., 281, C1904-1916) or to existing GenBank
entries (APOBEC1: NM_012907.1; APOBEC2: NM_006789.1).

The plasmids were transformed into E. coli strain KL16 and their effect on the frequency
of mutation to rifampicin-resistance (Rif^R) measured by fluctuation analysis (Tables 2 and
3).

Mutation assays

Mutation frequencies were measured by determining the median number of
colony-forming cells surviving selection per 10^ viable cells plated. Each median was
determined from 8-16 independent cultures grown overnight to saturation in rich medium
supplemented with 100 µg/mL carbenicillin and 1 mM IPTG (unless indicated otherwise).
Rif^R and Nal^R colonies were selected on rich medium containing 100 µg/ml rifampicin
and 40 µg/ml nalidixic acid respectively. Valine- and fucose-resistant mutants were
selected on minimal M9 medium containing 0.2% glucose/40 µg/ml L-valine and 0.1%
L-arabinose/0.2% D-fucose respectively.
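
The frequency calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure used to produce the tables: the function names and the example colony counts are hypothetical, and because the exponent in the "per 10^ viable cells" denominator is not legible in this text, the `per` argument is a placeholder set to 1e9.

```python
from statistics import median


def mutation_frequency(resistant, viable, per=1e9):
    """Resistant colonies per `per` viable cells plated, for one culture.

    `per` is a placeholder scale (an assumption, not a value from the text).
    """
    if not viable > 0:
        raise ValueError("viable cell count must be positive")
    return resistant * per / viable


def median_frequency(cultures, per=1e9):
    """Median frequency over independent cultures given as (resistant, viable) pairs."""
    return median(mutation_frequency(r, v, per) for r, v in cultures)


def fold_enhancement(test, control, per=1e9):
    """Median frequency of the test cultures divided by that of the controls."""
    return median_frequency(test, per) / median_frequency(control, per)


# Hypothetical colony counts, for illustration only (not data from this document).
vector_cultures = [(2, 1e9), (4, 1e9), (6, 1e9), (3, 1e9)]
aid_cultures = [(20, 1e9), (28, 1e9), (35, 1e9), (21, 1e9)]
print(fold_enhancement(aid_cultures, vector_cultures))  # 24.5 / 3.5 -> 7.0
```

Using the median rather than the mean keeps a single culture with an early "jackpot" mutation from dominating the estimate, which is the usual reason for reporting medians in fluctuation analyses.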

In multiple experiments performed in the presence of the transcriptional inducer IPTG,
the AID-transformed cells generated Rif^R colonies at a frequency some 4-8 fold higher
than vector-transformed controls. This stimulation was evident in different genetic
backgrounds (KL16, GM1003 and AB1157), was dependent upon AID (monitored ±
IPTG) and was not peculiar to the selection applied, being also clear when mutation to
nalidixic acid (Nal)-, valine- or fucose-resistance was monitored (Fig. 2 and Table 1).
The variation in the mutation enhancement observed in the different selections could
reflect differences in the types and abundances of mutations that confer resistance.

In similar multiple experiments, cells transformed with Apobec-1 generated Rif^R colonies
at a much higher frequency (several hundred fold) than vector-transformed controls. Cells
transformed with other Apobec homologues, Apobec3C and Apobec3G, also showed an
increased frequency (10-20 fold) of mutation to Rif^R compared to the vector-transformed
controls.

For the experiments shown in Figure 5 and Table 3, all measurements were performed
using KL16 or its ung-1 derivative BW310 transformed with vector alone or an
expression construct as indicated. Mutation frequencies were measured by determining
the median number of colony-forming cells surviving selection per 10^ viable cells plated.
Each median presented in Fig. 5 and Table 3 was determined from 12-16 independent
cultures grown overnight to saturation in rich medium supplemented with 100 µg/mL
carbenicillin and 1 mM IPTG (with the exception of control experiments in which the
inducer IPTG was omitted, Fig. 5b). IPTG-induced expression of APOBEC1 or its
homologues conferred no obvious defect in cell growth or viability (e.g. APOBEC1, Fig.
5c). Rif^R mutants were selected on rich medium containing 100 µg/ml rifampicin and
sequenced. Only about 1% of the Rif^R colonies failed to contain mutations in the region of
rpoB sequenced [nucleotides 1525-1722, numbering from the initiating ATG; GenBank
AE000472].

The nature of the Rif^R and Nal^R mutants was determined by directly amplifying and
sequencing the relevant section of the rpoB [627 bp PCR product amplified using 5'-
TTGGCGAAATGGCGGAAAACC and 5'-CACCGACXK3ATACCACCTGCTG] or gyrA [521 bp
PCR product amplified using oligonucleotides 5'-GCGCGGCTGTGTTATAATTT and 5'-
TTCCGTGCCGTCATAGTTATC].

If the AID-mediated enhancement in mutation frequency is due to a stimulation of dC
deamination, the pattern of mutation to Rif^R should show a shift toward dC->dT and
dG->dA transitions. Sequencing of the rpoB gene in multiple independent Rif^R colonies
revealed that this is indeed the case. Such transitions account for 79% of the mutations
scored in the AID-transformed cells but only for 31% of the mutations in the vector
transformed controls (Fig. 3a, b). Given the extent of mutation stimulation by AID, the
data are consistent with the entire AID-mediated enhancement being due to transitions at
dG/dC. A similar conclusion was obtained by examining the spectrum of gyrA mutations
in the Nal^R colonies - despite the fact that the selected mutations appeared restricted to
essentially three nucleotide positions. Thus, whereas 34% of the gyrA mutations amongst
the control transformants are nucleotide transitions at dG/dC, the percentage increases to
71% in the AID transformants (Fig. 4).
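
The classification behind such percentages can be illustrated with a short Python sketch. It assumes each sequenced mutation is recorded as a (reference base, mutant base) pair read on one strand; the function names and the example list are hypothetical and are not data from these experiments.

```python
# Substitutions expected from dC deamination: C->T directly, or G->A when the
# deaminated dC lies on the opposite strand.
DC_DG_TRANSITIONS = {("C", "T"), ("G", "A")}


def is_dc_dg_transition(ref, alt):
    """True if the substitution is a transition at a dC/dG base pair."""
    return (ref.upper(), alt.upper()) in DC_DG_TRANSITIONS


def percent_dc_dg_transitions(mutations):
    """Percentage of the supplied substitutions that are dC/dG transitions."""
    mutations = list(mutations)
    if not mutations:
        raise ValueError("no mutations supplied")
    hits = sum(is_dc_dg_transition(ref, alt) for ref, alt in mutations)
    return 100.0 * hits / len(mutations)


# Hypothetical substitutions, for illustration only.
example = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A"), ("c", "t")]
print(percent_dc_dg_transitions(example))  # 3 of 5 -> 60.0
```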

It is notable that there is a striking difference in mutation distribution between the AID
transformants and controls. Analysis of the rpoB mutations amongst the
vector-transformed control cells reveals that dC->dT transitions at positions Ser512, Ser522,
His526, Ser531, Pro564 and Ser574 can all confer Rif^R. However, transitions at only
some of these positions (His526, Ser531 and Ser574) are enhanced amongst the AID
transformants whereas other positions (Ser522 and Pro564) show little sign of increased
mutation. Even more striking is the fact that a common dG->dA transition in the AID
transformants (Arg529) is not seen at all in the controls (Fig. 3). Similar evidence of
specific targeting comes from gyrA. Whereas dC->dT transitions at Ser83 and dG->dA
transitions at Asp87 can both confer Nal^R, it is the C->T transitions at Ser83 that are
selectively enhanced by AID (Fig. 4). Despite this strong evidence that AID-dependent
mutation is non-random, presumably depending upon local sequence environment, we
cannot discriminate on these datasets whether this sequence preference reflects a hotspot
preference similar to that of the dG/dC-biased phase of antibody hypermutation (Rada et al.
(1998)) since there are only a limited number of base substitutions that can yield the
selected phenotypes.

If AID-induced mutations in the E. coli transformants are indeed occurring through
deamination of dC, an enhancement of the effect would be expected in cells lacking
uracil-DNA glycosylase (UDG). This is indeed the case. Although both UDG deficiency
and ectopic expression of AID are sufficient in themselves to yield a mutator phenotype,
AID expression in an ung background yielded a mutation frequency that was much
greater than the sum of their independent mutation frequencies (Fig. 2 and Table 1). A
similar effect was seen in E. coli expressing Apobec-1 (see Table 2).



In E. coli ung-1 mutants, some back-up uracil DNA-glycosylase activity may be provided
by the product of the mug gene (Sung et al. (2001) and Mokkapati et al. (2001)). It is
found that whilst the AID mutator effect is not significantly higher in a mug- than mug+
background, the mug mutation allows at most a slightly augmented AID-mutator effect
when combined with ung-1 (Table 1). If AID were to act by deaminating dG rather than
dC, an increased mutation frequency in a background deficient in endonuclease V
(encoded by nfi) might be anticipated since this enzyme is implicated in the repair of
deoxyxanthosine. This does not occur; the mutation frequency displayed by
AID-transformed nfi-1 cells approximates the sum of the frequencies that are independently
attributable to AID and nfi-1 (Table 1).

The data strongly suggest that AID mediates the deamination of dC residues in the DNA.
The homology of AID to Apobec-1 and cytidine deaminases (Muramatsu et al. (1999))
obviously argues in favour of a close involvement of AID in the DNA deamination
process itself. The preferential targeting of mutation to the immunoglobulin loci in
lymphocytes presumably depends on proteins with which AID associates. Given that the
cis-regulation of both switch recombination and hypermutation is linked to the
transcription regulatory elements (Manis et al. (2002); Betz et al. (1994)), it would appear
likely that AID is recruited either directly or indirectly by transcription- or
chromatin-associated factors.



APOBEC1-transformed bacteria grown in the presence of the transcriptional inducer
IPTG displayed massively elevated frequencies of Rif^R mutation (Fig. 5a). This
enhancement was confirmed by fluctuation analyses of Rif^R mutants observed in three
independent experiments (Table 3 top). In comparison to vector-transformed cells, the
median enhancement by APOBEC1 ranged from 440- to 700-fold (mean of 530), whereas
that attributable to AID ranged from 3.8- to 13-fold (mean of 7.8, in agreement with data
above). The observed increases were due to APOBEC1 since experiments performed in
the absence of the inducer IPTG resulted in a significantly diminished effect (Fig. 5b).
Furthermore, single amino acid changes E63A, W90S and C93A (which are located at or
close to the proposed Zn2+-coordination domain at the active site of APOBEC1
(Navaratnam et al.)) abolished the enhancement (Fig. 5c). The stimulation was not specific
to the selection or the locus (Rif^R mutations map largely to the rpoB gene) since it was
also clear when resistance to nalidixic acid was selected (due mostly to mutations in the
gyrA gene) (Table 3 (top)). Mutation frequencies at gyrA were significantly lower than at
rpoB; this likely reflects restrictions in numbers of base substitutions that each locus
permits (both genes are essential and fewer sites appear mutable in gyrA). It is notable
that whilst APOBEC1 yields a mutator phenotype in these assays as strong as that
achieved with some of the most potent E. coli mutators (e.g. mismatch repair-defective
strains (Schaaper (1993), J. Biol. Chem. 268, 23762-23765)), the increased mutation load
due to APOBEC1 expression caused no obvious defects in cell growth or viability (Fig.
5d). This might reflect the nature of the lesions introduced by APOBEC1 expression.

If, as with AID, the observed stimulation of mutation is due to increased dC deamination,
then this should be apparent in the spectrum of Rif^R mutations - a bias toward dC/dG ->
dT/dA transition mutations would result. This was confirmed by sequencing rpoB gene
PCR products from purified Rif^R colonies selected from APOBEC1-transformed cultures
(as well as from AID-transformed and vector-transformed controls). By comparison with
vector-transformed controls, APOBEC1-transformed cells showed a dramatic shift in
mutation spectrum, from 27% (32/120 mutations) to 100% (136/136 mutations) transition
mutations at dC/dG (Fig. 2). Consistent with the results presented above,
AID-transformed E. coli gave a somewhat less dramatic shift to 82% (102/124) transitions at
dC/dG (Fig. 7) reflecting the fact that AID, at least in this system, is a less potent mutator
than APOBEC1.

The mutation spectra revealed striking local differences between vector-, APOBEC1- and
AID-transformed cells with respect to the specific dC/dG pairs targeted. Whereas, in
keeping with AID, the majority of dC/dG to dT/dA transitions in rpoB in
AID-transformed cells clustered at C1576 (45/124 mutations) and G1586 (23/124 mutations),
those in APOBEC1-transformed cells showed a quite distinct distribution with major
hotspots at C1535 (39/136 mutations) and C1592 (74/136 mutations) (Fig. 6a and Fig. 7).
Thus, the entire enhancement of Rif^R mutation frequency observed in
APOBEC1-transformed cells occurs via transitions at dC/dG base pairs but with the local targeting
specificity being remarkably different from that of AID.
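
As a worked check of the hotspot figures quoted above, the following Python sketch converts the reported counts into percentages of all sequenced rpoB mutations and tests whether the two proteins share any of the listed hotspots. The counts are taken from the paragraph above; the data layout and function names are assumptions made for illustration.

```python
# Counts quoted in the text: mutations at each hotspot and the total number of
# rpoB mutations sequenced for that construct.
REPORTED = {
    "AID": {"total": 124, "hotspots": {"C1576": 45, "G1586": 23}},
    "APOBEC1": {"total": 136, "hotspots": {"C1535": 39, "C1592": 74}},
}


def hotspot_shares(entry):
    """Percentage of all sequenced mutations found at each listed hotspot."""
    return {pos: 100.0 * n / entry["total"] for pos, n in entry["hotspots"].items()}


def shared_hotspots(a, b):
    """Hotspot positions listed for both constructs."""
    return set(a["hotspots"]).intersection(b["hotspots"])


for name, entry in REPORTED.items():
    rounded = {pos: round(pct, 1) for pos, pct in hotspot_shares(entry).items()}
    print(name, rounded)

# An empty set here reflects the distinct local targeting described in the text.
print(shared_hotspots(REPORTED["AID"], REPORTED["APOBEC1"]))
```

On these counts the two listed positions account for roughly 55% of the AID-associated mutations and roughly 83% of the APOBEC1-associated mutations, with no position in common.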

The different local targeting specificities of APOBEC1 and AID strongly argue that both
proteins are involved in a dC deamination process, generating dU/dG lesions in DNA.
Given this likely mode of action, one would expect that the stimulation of mutation by
APOBEC1 (like that by AID) would be enhanced in cells lacking uracil-DNA glycosylase
(UDG), an enzyme that specifically recognises dU in DNA and initiates base excision
repair of dU/dG lesions (Lindahl). UDG-deficiency (ung-1) and APOBEC1 expression by
themselves enhance mutation about 10- and 500-fold, respectively. APOBEC1
expression in UDG-deficient cells further increases levels of mutation to about 2600-fold
above vector-transformed ung+ cells, a much more than additive effect demonstrating that
APOBEC1 is capable of triggering dU/dG lesions (Table 3 top). Despite the additional
mutation load in cells, sequence analysis of the mutations conferring Rif^R revealed
that, as for AID, the mutational targeting by APOBEC1 in an ung-1 background was
essentially the same as in ung+ cells (data not shown).
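
The "much more than additive" comparison made above can be written out as a small Python check. The fold values are the approximate figures quoted in the text; treating the additive expectation as the simple sum of the two fold increases is a simplifying assumption of this sketch, and the function names are hypothetical.

```python
def additive_expectation(fold_a, fold_b):
    """Fold increase expected if the two effects simply summed (assumed model)."""
    return fold_a + fold_b


def is_more_than_additive(fold_a, fold_b, fold_combined):
    """True if the combined effect exceeds the simple sum of the two effects."""
    return fold_combined > additive_expectation(fold_a, fold_b)


# Approximate fold increases over vector-transformed ung+ cells, from the text.
ung_fold, apobec1_fold, combined_fold = 10, 500, 2600

expected = additive_expectation(ung_fold, apobec1_fold)
print(expected)  # 510
print(is_more_than_additive(ung_fold, apobec1_fold, combined_fold))  # True
print(round(combined_fold / expected, 1))  # 5.1
```

On these figures the combined effect is about five times what a simple sum would predict, which is the basis for the inference that the lesions introduced by APOBEC1 are substrates for UDG.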

At least six other APOBEC1-like proteins exist in humans (Madsen et al.; Jarmuz et al.;
Anant et al.). APOBEC2 (also called APOBEC1-related cytidine deaminase-1, ARCD-1)
is found on chromosome 6p21.1 and the others, termed APOBEC3A through
APOBEC3G (also termed phorbolins or ARCDs) are encoded on chromosome
22q12-q13. They all contain a region homologous to the putative Zn2+-binding cytidine
deaminase motif of APOBEC1. This suggested that the mutator activities of these
proteins might also be conserved and prompted us to ask whether these homologues might
also work on DNA. Expression of APOBEC3C and APOBEC3G, representative
members of the chromosome 22 cluster (Fig. 7a), but not APOBEC2, triggered increases
in the frequencies of mutation to Rif^R and Nal^R in E. coli (Table 3 bottom). The
stimulation of mutation by APOBEC3G is significantly greater when monitored by the
frequency of Rif^R rather than Nal^R clones (Table 3). This may reflect the relatively strong
target preference of APOBEC3G (see below) taken together with the fact that there are
many more dC/dG targets in rpoB than in gyrA that can confer resistance to the relevant
antibiotic. The mutation frequencies to Rif^R achieved with both APOBEC3C and
APOBEC3G were also further elevated in an ung-1 background indicating that they, like
APOBEC1 and AID, potentiate dU/dG mispairs, substrates for UDG and subsequent
repair (Lindahl). In contrast, cells transformed with a human APOBEC2 expression
construct showed neither increased mutation frequencies (ung+ or ung- backgrounds;
Table 3 bottom) nor a significantly altered rpoB mutation spectrum (data not shown).



That APOBEC3C and APOBEC3G also act like APOBEC1 and AID is supported further
by a near complete shift in the spectrum of mutations that yield Rif^R, from 27% (32/120)
dC/dG -> dT/dA transitions in vector-transformed cells to 94% (102/108) and 88%
(81/92) in APOBEC3C- and APOBEC3G-transformed cells respectively (Fig. 7).
Moreover, a direct comparison of the dC/dG base pairs targeted by APOBEC3C and
APOBEC3G with those targeted by APOBEC1 and AID revealed obvious biases, the
most striking of which was mutation of C1691 by APOBEC3G (71/92 mutations
compared to 4/120 for vector-transformed cells). APOBEC3C, on the other hand, shared
one hotspot with APOBEC1 (44/108 at C1535) and another with AID (23/108 at C1576),
and appeared to be slightly more promiscuous, causing dC/dG -> dT/dA transition
mutations at eight positions in rpoB (Fig. 7b).
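
The percentages quoted for the shift in mutation spectrum follow directly from the
mutation counts given above. As a minimal sketch (the dictionary layout and variable
names are illustrative only), the arithmetic is:

```python
# Counts of dC/dG -> dT/dA transitions among sequenced Rif^R mutations,
# as quoted in the text: (transitions, total mutations sequenced).
transition_counts = {
    "vector": (32, 120),
    "APOBEC3C": (102, 108),
    "APOBEC3G": (81, 92),
}

# Percentage of mutations that are dC/dG -> dT/dA transitions, to the nearest 1%.
transition_percent = {
    name: round(100 * transitions / total)
    for name, (transitions, total) in transition_counts.items()
}

for name, pct in transition_percent.items():
    print(f"{name}: {pct}% transitions at dC/dG")
```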

EXAMPLE 2

APOBEC1 expressed in E. coli can be used to mutate a heterologous gene integrated
into the chromosome.

The Bacillus subtilis gene SacB is toxic to E. coli in the presence of sucrose. SacB is
cloned under the control of the E. coli promoter for PhoB and the cassette integrated into
the chromosome of E. coli strain DH10B at the Lambda phage attachment site using
pRB700 (Figure 8a), a derivative of the CRIM system plasmid pSK50A-uidA2
(Haldimann et al 1996 Proc Natl Acad Sci U S A. 93(25):14361-6; Haldimann and
Wanner 2001 J Bacteriol. 183(21):6384-93). The PhoB promoter is active under
conditions of low inorganic phosphate availability. Thus mutants in the SacB cassette can
be selected by growing independent colonies transfected with either an APOBEC-1
expression construct or a control plasmid to saturation (using one fortieth of the colony as
the inoculum) and plating on minimal MOPS medium containing 5% sucrose and limiting
phosphate.

PCR and subsequent sequencing of the integrated SacB genes in these sucrose-resistant
colonies demonstrates that spontaneous mutants at this locus arise primarily by transposon
insertion (and therefore generate a significantly larger than expected PCR product). This
accounted for 13/16 spontaneous mutations. In contrast, point mutations predominate
when APOBEC-1 is expressed. Furthermore, these point mutations are overwhelmingly
(32/33) transitions at C and G, consistent with these mutations arising by deamination of
dC as expected (Figure 9).

Enhanced targeting of mutation can be achieved by inducing convergent promoters
upstream and downstream of the desired gene.

The dependence of mutation caused by APOBEC-1 at this locus on transcription is
investigated. APOBEC-1 increases the mutation frequency at SacB approximately 12-fold
when colonies are grown in rich medium, and growth in medium containing limiting
phosphate does not appear to enhance mutation at this locus. To investigate the
possibility that transcription in both directions might be required to show an increase in
mutation frequency, a variant SacB cassette in pRB740 is created, under the control of the
same PhoB promoter, additionally placing the strong IPTG-inducible Trc promoter
downstream of the SacB gene in the opposite orientation (Figure 8b).

The mutation frequency following growth in rich medium with IPTG of SacB in this case
is comparable to that of the original cassette without the convergently orientated Trc
promoter, and so is the spectrum of point mutations obtained (20/20 are transitions at C or
G), indicating that this variant SacB cassette does not mutate with an appreciably higher
frequency when transcription is induced only in the antisense direction.

However, following growth in limiting phosphate together with IPTG, the mutation
frequency is enhanced approximately 1000-fold above that achieved either by APOBEC-1
without the downstream promoter under the same conditions, or with the downstream
promoter in rich medium (Figure 10a, b).

Thus, activation of convergent promoters located on opposite sides of the gene is able to
enhance APOBEC-1 induced mutation at that locus very appreciably. Furthermore,
expression of the less mutagenic APOBEC family members AID and APOBEC-3G under
these conditions of convergent transcription also gives rise to a significant increase in
mutation frequency above background and a shift towards the expected PCR product size,
indicating that transposon insertions are responsible for a lower proportion of the
observed mutants. The expected PCR size product is obtained in 10/10 and 7/10 cases for
AID and APOBEC3G respectively, compared to only 2/10 and 5/10 respectively under
non-transcribing conditions (Figure 11).

Under conditions of bi-directional transcription, transitions at C or G account for 8/8 and
5/5 of the point mutations observed for AID and APOBEC3G respectively (Figure 11).
Taken together, these results demonstrate that targeted deamination by members of the
APOBEC family can be achieved if the desired gene to be targeted is placed between
convergent promoters (Figure 11).
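
The logic of the growth conditions used in this Example can be summarised in a small
sketch. It is a simplified model, not part of the described method: the function and
parameter names are hypothetical, and it assumes that the PhoB promoter drives sense
transcription only under limiting phosphate and that the downstream Trc promoter drives
antisense transcription only when present and induced with IPTG.

```python
def transcribed_directions(has_trc_promoter: bool,
                           limiting_phosphate: bool,
                           iptg: bool) -> set:
    """Return the directions in which the SacB cassette is transcribed.

    Simplifying assumptions (illustrative only): PhoB promoter is on only under
    limiting phosphate; Trc promoter is on only when present and IPTG is added.
    """
    directions = set()
    if limiting_phosphate:
        directions.add("sense")        # upstream PhoB promoter
    if has_trc_promoter and iptg:
        directions.add("antisense")    # downstream, oppositely oriented Trc promoter
    return directions


def is_convergent(has_trc_promoter: bool, limiting_phosphate: bool, iptg: bool) -> bool:
    """True when the gene is transcribed from both sides at once."""
    return transcribed_directions(has_trc_promoter, limiting_phosphate, iptg) == {
        "sense", "antisense"}


# The three situations compared in the text.
conditions = {
    "variant cassette, rich medium + IPTG": (True, False, True),
    "original cassette, limiting phosphate + IPTG": (False, True, True),
    "variant cassette, limiting phosphate + IPTG": (True, True, True),
}
for name, args in conditions.items():
    print(f"{name}: {sorted(transcribed_directions(*args))}, "
          f"convergent={is_convergent(*args)}")
```

Only the last condition gives convergent transcription, which is the condition under
which the text reports the approximately 1000-fold enhancement.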

EXAMPLE 3

Deamination of cytosine to uracil in DNA can be achieved in vitro using partially
purified APOBEC1 from extracts of transformed Escherichia coli.

Plasmids and bacteria 

The pTrc99- and pET-based expression vectors for rat APOBEC1 and its E63->A mutant,
for human APOBEC2 and for E. coli dCTP deaminase as well as the E. coli host strains
have been described previously (Rada et al. (2002) Curr. Biology 12, 1748-1755;
Randerath K. and Randerath E. (1967) Method Enzymol. 12, 323-347). The pTrc99- and
pET-based vectors differ both in the nature of the promoter used (pTrc99 uses the trp/lac
hybrid promoter whereas pET uses the T7 promoter) and in the length of heterologous
peptide linked to the amino-terminus of the recombinant protein (9 amino acids with
pTrc99 but 34 amino acids with pET (Rada et al. (2002) Curr. Biology 12, 1748-1755;
Randerath K. and Randerath E. (1967) Method Enzymol. 12, 323-347)).

Oligodeoxyribonucleotides

The oligodeoxyribonucleotides used are listed in Table 4.

Preparation of recombinant APOBEC1

A 2 ml overnight culture of a fresh E. coli transformant grown in LB, 0.2% glucose, 50
µg/ml carbenicillin was diluted into 300 ml of the same medium and grown at 37°C to an
A600 of 0.8. The culture was chilled on ice for 20 min and then incubated with aeration
for 16 h at 16°C in the presence of inducer (1 mM IPTG). Cells were harvested by
centrifugation, washed and resuspended in 20 ml H buffer (50 mM Tris-HCl, pH 7.4, 50
mM KCl, 5 mM EDTA, 1 mM DTT and a protease inhibitor cocktail [Roche]).
Following sonication and ultracentrifugation (100,000g for 45 min), the supernatant was
passed through a 0.2 µm filter and applied to a Sepharose Fast-Flow Mono-Q column
(Amersham Biosciences; 10 ml bed volume). After washing with seven column volumes
of buffer H, bound proteins were eluted in buffer H supplemented with increasing salt
concentrations (from 50 to 1500 mM Cl-) collecting 15 ml fractions. Fractions and
flow-through were concentrated one-hundred fold using VivaSpin concentrators (Mr 10,000
cut-off) (VivaScience) and assayed. Samples eluting with 1000-1500 mM salt were
pooled and loaded in a volume of 0.5 ml onto a HighPrep Sephacryl S-200
High-Resolution 16/60 gel-filtration column (Amersham Biosciences) in buffer H. Fractions
(1 ml) were collected and concentrated twenty fold before analysis.

TLC-based deaminase assay

Samples (2-4 µl) were incubated at 37°C for 5 h in 20 µl of buffer R (40 mM Tris pH 8,
40 mM KCl, 50 mM NaCl, 5 mM EDTA, 1 mM DTT, 10% glycerol) containing 75,000
cpm of α-[32P]dC-labelled single-stranded DNA (prepared by a 3 min heating to 95°C of
the products of asymmetric PCR amplification of the lacI region in pTrc99 performed
using α-[32P]dCTP (3000 Ci/mmol)). Following phenol extraction and ethanol
precipitation, the DNA was digested with Penicillium citrinum P1 nuclease (Sigma)
overnight at 37°C (Grunau C., Clark S.J., and Rosenthal A. (2001) Nucleic Acids Res. 29,
E65) and the P1 digests then subjected to thin layer chromatography on PEI-cellulose in
either (i) 0.5 M LiCl at 4°C or (ii) at room temperature in 1 M CH3COOH until the buffer
front had migrated 2.5 cm and then in 0.9 M CH3COOH:0.3 M LiCl (Cohen, R.M., and
Wolfenden, R. (1971) J. Biol. Chem. 246, 7561-7565). Products were detected using a
phosphorimager. Chemical deamination of cytosine in DNA using bisulfite/hydroquinone
was performed as described (Yamanaka et al. (1995) Proc. Nat. Acad. Sci. USA, 92,
8483-8487).

UDG-based deaminase assay

Samples (1-2 µl) were incubated at 37°C for 2 h in 10 µl of buffer R with 5'-biotinylated
oligonucleotides that either were synthesized with fluorescein at their 3'-ends (3 pmol of
oligonucleotide per reaction) or were 3'-labelled by ligation with α-[32P]dideoxyadenylate
(100,000 cpm; 0.1 pmol) using terminal deoxynucleotidyl transferase.

Reactions were terminated by heating to 90°C for 3 min and oligonucleotides purified on
streptavidin magnetic beads (Dynal), washing at 72°C (except in Fig. 2A, where the
streptavidin purification step was omitted). Deamination of cytosine in the
oligonucleotides was monitored by incubating the bead-immobilised oligonucleotides at
37°C for 30 min with excess uracil-DNA glycosylase (0.5 units UDG; enzyme and buffer
from NEB) and then bringing the sample to 0.15 M in NaOH and incubating for a further
30 min. The oligonucleotides were then subjected to electrophoresis on 15% PAGE-urea
gels which were developed by either fluorescence detection or phosphorimager analysis.

Western blotting

Western blot detection of APOBEC1 following SDS/PAGE of samples that had been
diluted 20-100 fold was performed using a goat anti-APOBEC1 serum (Santa Cruz
Biotechnology), developing with horseradish peroxidase-conjugated donkey anti-goat
immunoglobulin antiserum (Binding Site, Birmingham, UK). Low-range molecular
weight markers were from BioRad.

RESULTS 

DNA deamination assay in cell extracts

Since, of all the APOBEC family members tested, APOBEC1 displayed the most potent
mutator activity in the E. coli mutation assay (Randerath K. and Randerath E. (1967)
Method Enzymol. 12, 323-347), APOBEC1-transformed E. coli were investigated in order
to see if DNA deamination activity in vitro using cell extracts could be detected.
Initially, the UDG-based deaminase assay was tried, working with an
oligodeoxyribonucleotide substrate. However, no evidence of deamination was obtained
using double-stranded oligonucleotide substrates whereas single-stranded
oligonucleotides were rapidly degraded by both APOBEC1 and control extracts (data not
shown). The possibility that the DNA deaminating activity might be specific for
single-stranded substrates but that this activity might be masked by non-specific nucleases
was investigated. An assay that would be less sensitive to contaminating nucleases
(Fig. 12A) was devised.

The bacterial extracts were incubated with α-[32P]dC-labelled single-stranded DNA
which was then purified, digested with nuclease P1 and subjected to thin-layer
chromatography to test for the presence of α-[32P]dUMP. Clear evidence of dC
deamination in this assay was detected using extracts of E. coli expressing two different
APOBEC1 constructs but not from control extracts or from extracts made from E. coli
cells carrying plasmids expressing mutant APOBEC1, APOBEC2 or dCTP deaminase
(none of which function as DNA mutators in the bacterial assay (Randerath K. and
Randerath E. (1967) Method Enzymol. 12, 323-347)) (Fig. 12B). The DNA deaminase
activity was evident in APOBEC1-transformants of a mutant E. coli deficient in both dcd-
and cdd-encoded deaminases (Fig. 12B (iii)). That the product of APOBEC1 action was
indeed dUMP is indicated by the co-migration of the radioactive product with dUMP in
two distinct buffer systems.

These results suggested fractionation of the extracts of APOBEC1-transformed E. coli to
see if the DNA deamination activity could be sufficiently separated from non-specific
nucleases so as to be detectable using the oligonucleotide cleavage assay.



Partial purification

Pilot experiments revealed that ion-exchange chromatography could be used to obtain
samples of APOBEC1 that contained diminished non-specific nuclease activity. Thus,
whilst only a proportion of the APOBEC1 polypeptide bound to the Mono-Q column
(around 10-20% based on ECL quantitation of the Western blot assay), elution of this
bound fraction with > 0.8 M Cl- yielded a sample that displayed cytosine-DNA
deamination activity (as monitored using the TLC-based assay) but containing diminished
non-specific nuclease activity in the UDG-based assay (Fig. 13). These fractions were
then concentrated and subjected to gel filtration (Fig. 13). The major APOBEC1 peak
eluted in fractions 7-9 (corresponding to an Mr of 95-140,000) co-eluting with peak DNA
deaminating activity. Indeed, with these fractions from the gel filtration column, DNA
deamination could now readily be detected by the UDG-based assay using a
single-stranded oligonucleotide substrate (although the peak fractions also contained activity
that removed the 3'-label from the oligonucleotide). Mass spectrometric analysis of proteins
in fraction 9 following SDS/PAGE revealed the recombinant APOBEC1 migrating at the
position marked by the asterisk in Fig. 13B(i), although the majority of the bands
derived from ribosomal proteins.

Characteristics of the DNA deaminating activity

The UDG-based deaminase assay was used to monitor the specificity and characteristics
of the partially purified APOBEC1 (Fig. 14A). Samples were incubated with a
single-stranded oligodeoxyribonucleotide (with or without its complement) which contained
internal dC residue(s) and that was 5'-biotinylated as well as 3'-labelled. After
purification on streptavidin, the oligonucleotide was treated with UDG (plus alkali),
resulting in site-specific cleavage if the oligonucleotide had been subjected to dC->dU
deamination. Thus, deamination is read out by the appearance of the specific cleavage
product following PAGE-urea analysis.

The partially-purified wild type protein (but not the E63->A mutant) showed clear activity
on a single-stranded oligonucleotide with the cleavage being dependent on the subsequent
incubation with UDG (Fig. 14B). The deaminating activity was not inhibited by
tetrahydrouridine (which inhibits cytidine deaminases (Frederico et al. (1990)
Biochemistry, 29, 2532-2537)) or by RNAse (Fig. 14D). Strikingly [and consistent with
our inability to detect deamination on double-stranded oligonucleotide substrates using
crude extracts of bacterial transformants (see above)], the activity was blocked if a
complementary (but not if an irrelevant) oligonucleotide was titrated into the assay
(Fig. 14). Examination of the cleavage products generated in the UDG-based assay suggests
that not all dC residues are equally susceptible to APOBEC1-mediated deamination. It is
clear, for example, that in oligonucleotide SPM168 the third cytosine in the sequence
TCCGCG is much less favoured than the other two (Fig. 14B-E). Similarly, evidence of
specificity comes from comparing various related oligonucleotides as substrates, where all
the data taken together point to deamination being especially disfavoured when a purine is
located immediately 5' of the cytosine (Fig. 14F, G).
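
The readout of the UDG-based assay can be illustrated with a short sketch. It assumes,
as a simplification, that UDG plus alkali cleaves the strand exactly at the deaminated
position and that the detected product is the 3'-labelled fragment lying downstream of
that position. The oligonucleotide sequence used below is hypothetical (it merely embeds
the TCCGCG motif mentioned above) and is not one of the oligonucleotides of Table 4;
the function names are likewise illustrative.

```python
def labelled_fragment_length(oligo: str, deaminated_index: int) -> int:
    """Length (nt) of the 3'-labelled fragment released after UDG/alkali cleavage.

    oligo is written 5'->3'; deaminated_index is the 0-based position of the dC
    that has been converted to dU. The abasic nucleotide itself is assumed lost,
    so the labelled product contains only the nucleotides 3' of that position.
    """
    if oligo[deaminated_index] != "C":
        raise ValueError("only dC residues can be deaminated to dU")
    return len(oligo) - deaminated_index - 1


def expected_products(oligo: str) -> dict:
    """Map each dC position to the labelled fragment length it would generate."""
    return {i: labelled_fragment_length(oligo, i)
            for i, base in enumerate(oligo) if base == "C"}


# HYPOTHETICAL 12-nt substrate containing the TCCGCG motif (not from Table 4).
oligo = "AAATCCGCGAAA"
for position, length in expected_products(oligo).items():
    print(f"dC at position {position}: labelled product of {length} nt")
```

Because each dC gives a product of a distinct length, the relative band intensities on
the PAGE-urea gel report on which cytosines in the substrate are preferred.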

DISCUSSION 

The results described here provide biochemical evidence that APOBEC1-mediated
deamination of cytosine to uracil can occur on single-stranded DNA, is dependent on
local sequence context and is abolished by mutation of the APOBEC1 zinc-coordination
motif. Unlike AID (where genetic evidence indicates that the natural physiological
substrate of deamination is DNA (Harris et al. (2002) Mol. Cell 10, 1247-1253; Wagner et
al. (1989) Proc. Nat. Acad. Sci. USA, 86, 2647-2651)), the major physiological substrate
of APOBEC1 is clearly apolipoprotein B RNA (Teng et al. (1993) Science 260, 1816-1819;
Blanc, V. and Davidson, N.O. (2003) J. Biol. Chem. 278, 1395-1398).
Nevertheless, the observation that misexpression of APOBEC1 in transgenic mice
predisposes to cancer suggests that APOBEC1-mediated DNA deamination could well be
of pathological relevance.

Given the abundance of APOBEC1 polypeptide in the peak fraction from the gel filtration
column, it appears that, on average, each molecule of recombinant APOBEC1 is
responsible for in the order of a single deamination event in a 10 minute incubation in the
UDG-based assay. Crude calculations indicate that if the ~500 molecules of APOBEC1
expressed in each E. coli transformant displayed a DNA deamination activity of this order
in vivo and if this were targeted randomly to all cytosine residues in the genome, then this
could, in principle, be more than sufficient to account for the several thousand-fold
enhanced mutation frequencies seen at the rpoB and other loci in UDG-deficient E. coli
following 20 generations of growth (Randerath K. and Randerath E. (1967) Method
Enzymol. 12, 323-347). Similarly, somatic hypermutation of immunoglobulin variable
genes by targeted AID-mediated dC deamination may involve as few as one, and most
probably fewer than ten, targeted dC deamination events in each B lymphocyte cell cycle.
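
The crude calculation referred to above can be sketched as follows. Only the figures of
about 500 molecules per cell, about one deamination per molecule per 10 minutes and 20
generations come from the text. The generation time, genome size and GC fraction are
round-number assumptions introduced here purely for illustration, and the sketch ignores
repair, replication and any non-random targeting.

```python
# Values taken from the text.
molecules_per_cell = 500              # ~500 APOBEC1 molecules per transformant
events_per_molecule_per_10_min = 1    # order of one deamination per 10 min
generations = 20

# ASSUMPTIONS (not from the specification; round numbers for illustration only).
generation_time_min = 30.0            # assumed doubling time
genome_bp = 4.6e6                     # assumed E. coli genome size in base pairs
gc_fraction = 0.5                     # assumed fraction of G:C base pairs

# Each G:C base pair carries exactly one cytosine (on one strand or the other).
cytosines_per_genome = genome_bp * gc_fraction

# Deamination events per cell in one generation.
events_per_generation = (molecules_per_cell
                         * events_per_molecule_per_10_min
                         * generation_time_min / 10)

# Average number of events per cytosine, per generation and cumulatively.
per_cytosine_per_generation = events_per_generation / cytosines_per_genome
per_cytosine_total = per_cytosine_per_generation * generations

print(f"events per cell per generation: {events_per_generation:.0f}")
print(f"events per cytosine per generation: {per_cytosine_per_generation:.2e}")
print(f"events per cytosine over {generations} generations: {per_cytosine_total:.2e}")
```

Under these assumptions the per-cytosine deamination load is far above typical
spontaneous mutation frequencies, which is consistent with the statement that activity of
this order could be more than sufficient to account for the observed enhancement.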

The results provide information about the preferred target of APOBEC1-mediated DNA
deamination. The in vitro assay reveals a clear sensitivity to the local sequence context of
the dC residue to be deaminated. The results obtained here suggest there may be bias
against a 5'-flanking purine residue. This would accord well with the in vivo data where a
near-total restriction to mutation at dC residues with a 5'-flanking pyrimidine is seen at
the rpoB locus (Randerath and Randerath).

The in vitro assay also reveals that APOBEC1 deamination is targeted to single-stranded
DNA and, indeed, was undetectable on double-stranded DNA. This specificity for
single-stranded DNA is in accordance with the fact that the natural substrate of APOBEC1 is
most likely single-stranded RNA (Blanc and Davidson) and, presumably, the same active
site in APOBEC1 is used for both types of polynucleotide. Furthermore, spontaneous
deamination of cytosine is also much more rapid in single- (as opposed to double-)
stranded DNA, which may explain the correlation with transcription of the DNA target
gene described herein and where convergent promoters increase the availability of
single-stranded DNA to APOBEC-1.

EXAMPLE 4 - Expression of Apobec-1 fusion proteins 

The Apobec-1 expression plasmid was generated as described above but using a nucleic
acid encoding rat Apobec-1 with an amino-terminal fusion encoding:

Met-His-His-His-His-His-His-His-His-Tyr-Asp-Ile-Pro-Thr-Al...
-Tyr-Phe-Gln-Gly-Ser- joining to the initiator Met of Apobec-1

The expression construct was expressed in E. coli strain BL21 DE3 (purchased
from Novagen) and the effect on the frequency of mutation to rifampicin-resistance (Rif^R)
measured by fluctuation analysis as described above.

The results are as follows:

                     Rif^R colonies

vector alone         42      35      28      23
His-Apobec-1         3000    3000    2000    1500

(The numbers are numbers of Rif^R colonies in 4 independent experiments, the
experiments being performed as in Tables 1 and 2).

This demonstrates that the Apobec fusion protein with a His-tag fused to its N-terminus
retains mutator activity in E. coli.
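
As a rough summary of the colony counts above, the median of the four experiments can be
compared between the two constructs. This is a minimal sketch using only the counts
listed; summarising by the median is a choice made here for illustration (it mirrors the
use of medians in Tables 1 to 3) and the variable names are hypothetical.

```python
from statistics import median

# Rif^R colony counts from the four independent experiments listed above.
vector_alone = [42, 35, 28, 23]
his_apobec1 = [3000, 3000, 2000, 1500]

median_vector = median(vector_alone)
median_his_apobec1 = median(his_apobec1)

# Fold increase in Rif^R colonies attributable to the His-tagged fusion protein.
fold_increase = median_his_apobec1 / median_vector

print(f"median, vector alone: {median_vector}")
print(f"median, His-Apobec-1: {median_his_apobec1}")
print(f"fold increase: about {fold_increase:.0f}-fold")
```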

EXAMPLE 5 - Hybridisation experiments 

A cancer profiling array was obtained from Clontech (Cat No. 7757-1) and hybridised as
directed with the following 32P-dCTP-labeled human cDNA probes: APOBEC1 (IMAGE
clone 2107422), APOBEC3G (IMAGE clone 1284557) and ubiquitin (control provided
with array). The array was hybridised first with APOBEC1, subsequently with
APOBEC3G, and finally with ubiquitin. After each hybridisation the probe was removed
by boiling in 0.5% SDS. Hybridisation images, shown in Figure 15, were visualised with
the Typhoon Phosphoimaging System (Pharmacia) and ImageQuant software. Data are
grouped by tissue to facilitate comparison, although the entire blot (representing all
tissues shown) was hybridised simultaneously as a single filter in each experiment (i.e.
with each probe) and the autoradiographic image subsequently separated by computer
manipulation (without adjusting gain or background).

Results 

APOBEC1 expression appears to be restricted to gastrointestinal tissues (colon, stomach,
rectum, and small intestine), whereas APOBEC3G was expressed to some extent in all
tissues examined. Perhaps most notable is the fact that for some tumour samples,
APOBEC1 (colon and rectum) and APOBEC3G (breast and kidney) appear better
expressed than in corresponding normal tissues (only intra-hybridisation pairs should be
considered). Note also that for APOBEC1 hybridisation of stomach samples the opposite
may be the case.

All publications mentioned in the above specification, and references cited in said
publications, are herein incorporated by reference. Various modifications and variations
of the described methods and system of the present invention will be apparent to those
skilled in the art without departing from the scope and spirit of the present invention.
Although the invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed should not be unduly
limited to such specific embodiments. Indeed, various modifications of the described
modes for carrying out the invention which are obvious to those skilled in molecular
biology or related fields are intended to be within the scope of the following claims.

Claims

1. A cell modified to express AID, or an AID variant, derivative or homologue, and
having a mutator phenotype.

2. A cell as claimed in claim 1 said cell being prokaryotic. 

3. A cell as claimed in claim 1 said cell being eukaryotic. 

4. A cell as claimed in any of claims 1 to 3 wherein the AID homologue is Apobec-1,
Apobec3C or Apobec3G.

5. A fusion protein comprising an AID, or AID variant, derivative or homologue,
polypeptide having a mutator phenotype operably linked to one half of a specific binding
pair.

6. A fusion protein as claimed in claim 5 wherein said one half of a specific binding pair
is a DNA binding domain.

7. A vector for expressing a fusion protein as claimed in claim 5 or claim 6.

8. A cell modified to express a fusion protein as claimed in claim 5 or claim 6.

9. A method for preparing a gene product having a desired activity, comprising the steps
of:

a) expressing a nucleic acid encoding the gene product in a population of cells according
to claim 1 or claim 8;

b) identifying a cell or cells within the population of cells which expresses a mutant gene
product having the desired activity; and

c) establishing one or more clonal populations of cells from the cell or cells identified in
step (b), and selecting from said clonal populations a cell or cells which expresses a
gene product having an improved desired activity.

10. A method as claimed in claim 9 wherein the nucleic acid encoding the gene product is
operably linked to the second half of a specific binding pair.

11. A method for directing mutation to a specific gene product of interest comprising the
steps of:

i) generating a nucleic acid construct comprising a nucleic acid sequence encoding a
specific gene product operably linked to a DNA binding protein recognition sequence;

ii) transfecting said nucleic acid construct into a population of host cells expressing a
fusion protein in accordance with claim 6;

iii) incubating said transfected host cells under conditions suitable for allowing the
specific binding pairing of DNA binding protein to DNA binding protein
recognition sequence to occur;

iv) identifying a cell or cells within the population of cells which expresses a mutant
gene product having the desired activity; and

v) establishing one or more clonal populations of cells from the cell or cells
identified in step iv), and selecting from said clonal populations a cell or cells
which expresses a gene product having an improved desired activity.

12. A method of identifying components of AID-dependent mutation activity comprising
expressing AID in a cell deficient in expression or activity of a known gene and assessing
mutator activity compared to activity in a cell expressing said gene.

13. A method of screening for a modulator of AID activity comprising:

a) expressing AID in a prokaryotic cell;

b) maintaining the AID-expressing prokaryotic cell in the presence of a selectable
medium;

c) detecting the presence of colonies in the absence or presence of a test compound,
wherein a modified number of colonies when compared to a sample in the absence of a
test compound is indicative of the ability of the test compound to modify AID mutator
activity.

14. Use of AID or a functional homologue thereof in inducing mutation in a cell.

15. Use of an agent that modifies AID functional activity or gene expression in the
manufacture of a medicament for use in a method for treating a disorder characterised by
increased mutations.

16. A method of decreasing hypermutation/resistance to a compound such as an antibiotic
in a population of bacteria by modulating activity of a bacterial AID homologue.

17. A construct for use in a method as claimed in any of claims 9 to 11, said construct
comprising a coding sequence for the gene product of interest wherein said coding
sequence is placed under the control of a first promoter upstream of the coding sequence
and further comprising a second promoter downstream of the coding sequence, wherein
said first and second promoters are arranged in opposing orientation so as to allow
convergent transcription of the coding sequence.



Table 1. AID expression stimulates mutation in E. coli.

                                              Mutation frequency
                                                   (x10^-9)
          Relevant                                                  Fold enhancement
Strain    genotype       Selection         (-)AID      (+)AID       by AID

KL16      ung+           Rifampicin        14          85           6.1
                                           11          62           5.6
                                           41          180          4.4
                         Nalidixic acid    1.8         5.2          2.9
                                           10          35           3.5
                         Valine            9.4         76           8.1
                                           2.0         90           45
                         Fucose            17          31           1.8
                                           9.6         39           4.1
          ung-1          Rifampicin        100         1100         11
                                           78          1800         23
                                           110         960          8.7
                         Nalidixic acid    9.7         62           6.4
                                           29          620          21
                         Valine            5.2         630          120
                                           4.5         310          69
                         Fucose            42          260          6.2
                                           24          240          10
GM1003    ung-1 mug+     Rifampicin        29          730          25
          ung+ mug       Rifampicin        18          110          6.1
          ung-1 mug      Rifampicin        23          810          35
AB1157    nfi+           Rifampicin        24          100          4.2
                                           21          110          5.2
          nfi-1          Rifampicin        36          150          4.2
                                           40          160          4.0

Table 2. AID and its homologues stimulate mutation in E. coli.

                              Rif^R mutation frequency (x10^-9)^a
Protein       Genetic         (-)AID or        (+)AID or        Fold increase dependent
expressed     background      homologue        homologue        upon AID or homologue

AID           ung+            33               160              4.8
                                               180              5.5
APOBEC1       ung+            33               12000            360
                              41               16000            390
              ung-1           260              140000           540
APOBEC3C      ung+            29               230              7.9
APOBEC3G      ung+            33               210              6.4

^a Mutation frequency is expressed as the median number of colony-forming cells
surviving selection per 10^9 viable cells plated. Each median was based on 8-16
independent cultures grown to saturation in rich medium supplemented with 1 mM IPTG
and 100 µg/mL carbenicillin (Sigma).
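
The fold-increase column of Table 2 is the ratio of the (+) to the (-) mutation
frequency, quoted to two significant figures. The sketch below recomputes it for the rows
in which both frequencies are given; the helper name and the row layout are illustrative
only.

```python
def round_sig(value: float, digits: int = 2) -> float:
    """Round a number to the given number of significant figures."""
    return float(f"{value:.{digits}g}")


# Rows of Table 2 with both frequencies present:
# (protein, background, (-) frequency, (+) frequency, fold increase as tabulated).
table2_rows = [
    ("AID", "ung+", 33, 160, 4.8),
    ("APOBEC1", "ung+", 33, 12000, 360),
    ("APOBEC1", "ung+", 41, 16000, 390),
    ("APOBEC1", "ung-1", 260, 140000, 540),
    ("APOBEC3C", "ung+", 29, 230, 7.9),
    ("APOBEC3G", "ung+", 33, 210, 6.4),
]

# Recompute the fold increase as (+)/(-) to two significant figures.
recomputed = [round_sig(plus / minus) for _, _, minus, plus, _ in table2_rows]

for (protein, background, minus, plus, tabulated), fold in zip(table2_rows, recomputed):
    print(f"{protein:9s} {background:6s} {plus}/{minus} = {fold:g} (tabulated: {tabulated:g})")
```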
Table 3: Median mutation frequencies (x10^-9)* and average fold enhancement
for three independent experiments and two selective conditions.

Protein                              Median mutation frequencies         Average fold
expressed    Selection  Background   (three independent experiments)     enhancement

Vector       Rif^R      ung+         13        34        29              1
             Rif^R      ung-1        180       360       160             10
             Nal^R      ung+         4.8       6         5               1
             Nal^R      ung-1        29        32        16              4.8
APOBEC1      Rif^R      ung+         9100      15000     13000           530
             Rif^R      ung-1        35000     100000    64000           2600
             Nal^R      ung+         1900      1800      1400            330
             Nal^R      ung-1        5400      9400      5400            1300
AID          Rif^R      ung+         170       220       110             7.8
             Rif^R      ung-1        2100      3600      2900            120
             Nal^R      ung+         14        13        9.8             2.4
             Nal^R      ung-1        210       160       160             34
APOBEC2      Rif^R      ung+         21        35        20              1.1
             Rif^R      ung-1        100       400       190             8.8
             Nal^R      ung+         6.2       5.8       4.3             1
             Nal^R      ung-1        32        18        21              4.6
APOBEC3C     Rif^R      ung+         250       440       190             20
             Rif^R      ung-1        3600      14000     2900            260
             Nal^R      ung+         49        35        34              7.5
             Nal^R      ung-1        660       3200      710             270
APOBEC3G     Rif^R      ung+         230       400       180             18
             Rif^R      ung-1        1200      3000      1200            120
             Nal^R      ung+         25        18        20              4.1
             Nal^R      ung-1        45        41        44              8.3

*Each median mutation frequency was determined from 12-16 independent cultures

[Figures 3 to 5: images not reproduced; surviving panel label: n=50]